Negative MCQs | Statistics & Probability


Negative Questions MCQs for Sub-Topics of Topic 16: Statistics & Probability

Content On This Page
Introduction to Statistics: Data and Organization	Frequency Distributions: Tables and Types	Graphical Representation of Data: Basic Charts
Graphical Representation: Frequency Distributions	Graphical Representation: Cumulative Frequency Graphs	Measures of Central Tendency: Introduction and Mean
Measures of Central Tendency: Median	Measures of Central Tendency: Mode and Relationship	Measures of Dispersion: Range and Mean Deviation
Measures of Dispersion: Variance and Standard Deviation	Measures of Relative Dispersion and Moments	Skewness and Kurtosis
Percentiles and Quartiles	Correlation	Introduction to Probability: Basic Terms and Concepts
Axiomatic Approach and Laws of Probability	Conditional Probability	Probability Theorems: Multiplication Law and Total Probability
Bayes’ Theorem	Random Variables and Probability Distributions	Measures of Probability Distributions: Expectation and Variance
Binomial Distribution	Poisson Distribution	Normal Distribution
Inferential Statistics: Population, Sample, and Parameters	Inferential Statistics: Concepts and Hypothesis Testing	Inferential Statistics: t-Test

Introduction to Statistics: Data and Organization

Question 1. Which of the following is NOT a correct statement about 'Data'?

(A) It is a collection of facts and figures.

(B) It can be numerical or qualitative.

(C) It is always derived from a primary source.

(D) It forms the basis for statistical analysis.

Answer:

Question 2. Raw data is unprocessed data. Which of the following is NOT a characteristic of raw data?

(A) It is collected directly from the source.

(B) It is organized into tables or graphs.

(C) It is also known as ungrouped data.

(D) It is the initial form of data before analysis.

Answer:

Question 3. Which of the following is NOT considered a quantitative variable?

(A) Number of students in a class.

(B) Height of a person.

(C) Eye colour.

(D) Daily temperature.

Answer:

Question 4. Which of the following is NOT considered a discrete variable?

(A) Number of cars owned by a family.

(B) Number of goals scored in a match.

(C) The weight of a bag of rice.

(D) The number of defective items in a box.

Answer:

Question 5. Which of the following is NOT a typical stage in the process of data handling?

(A) Collection.

(B) Organization.

(C) Destruction.

(D) Analysis.

Answer:

Question 6. Which of the following is NOT a common method for collecting primary data?

(A) Conducting a direct personal interview.

(B) Collecting data from a government census report.

(C) Sending out questionnaires for self-completion.

(D) Using direct observation of events.

Answer:

Question 7. Secondary data is data that has already been collected by someone else. Which of the following is NOT a typical source of secondary data?

(A) Publications from the Registrar General of India.

(B) Data from a specifically commissioned household survey conducted by the researcher.

(C) Reports published by international organizations like the World Bank.

(D) Data available in research journals or newspapers.

Answer:

Question 8. Organizing data involves arranging it in a systematic manner. Which of the following is NOT a direct benefit of organizing data?

(A) Making data easier to understand.

(B) Facilitating further analysis.

(C) Reducing the number of observations in the dataset.

(D) Making it simpler to identify key features.

Answer:

Question 9. Data interpretation is the stage where conclusions are drawn from analyzed data. Which of the following is NOT a part of the data interpretation process?

(A) Identifying patterns and trends in the data.

(B) Drawing meaningful conclusions based on analysis results.

(C) Deciding on appropriate graphical representation methods.

(D) Making decisions based on the insights gained from the data.

Answer:

Question 10. Which of the following is NOT a correct distinction between a variable and a constant?

(A) A variable can change its value, while a constant's value is fixed.

(B) Variables are always numerical, while constants can be anything.

(C) In a study, temperature is a variable, while the number of days in a week is a constant.

(D) Variables are characteristics measured, constants are fixed quantities.

Answer:

Frequency Distributions: Tables and Types

Question 1. Which of the following is NOT a key element typically found in a frequency distribution table?

(A) Classes or categories.

(B) Frequencies.

(C) Cumulative frequencies.

(D) Measures of central tendency (Mean, Median, Mode).

Answer:

Question 2. In grouped frequency distribution, which of the following is NOT a correct statement about class intervals?

(A) They are ranges of values for grouping data.

(B) They have a lower limit and an upper limit.

(C) All class intervals in a distribution must have the same size.

(D) They help condense a large amount of data.

Answer:

Question 3. Consider the class interval 10-20 (exclusive). Which of the following is NOT a correct statement about this interval?

(A) The lower limit is 10.

(B) The upper limit is 20.

(C) An observation with value 20 is included in this class.

(D) The class size is 10.

Answer:

Question 4. Which of the following is NOT a valid way to calculate the class mark (mid-point) of a class interval with lower limit L and upper limit U?

(A) $(L + U) / 2$.

(B) (Lower boundary + Upper boundary) / 2.

(C) L + (Class Size / 2).

(D) L + U.

Answer:
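
The class-mark expressions in Question 4 can be compared numerically. This is a minimal sketch, assuming an exclusive class 10-20 chosen only for illustration; the function names are hypothetical and not part of the question.

```python
def class_mark_from_limits(lower, upper):
    """Mid-point of a class: (L + U) / 2."""
    return (lower + upper) / 2


def class_mark_from_size(lower, class_size):
    """Equivalent form: L + (class size / 2)."""
    return lower + class_size / 2


lower, upper = 10, 20            # illustrative exclusive class 10-20
class_size = upper - lower       # 10

mark_a = class_mark_from_limits(lower, upper)
mark_b = class_mark_from_size(lower, class_size)
plain_sum = lower + upper        # not a mid-point: it lies outside the class

print(mark_a, mark_b, plain_sum)
```

Both mid-point forms agree, while the plain sum of the limits falls outside the interval.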

Question 5. Cumulative frequency tables show the running total of frequencies. Which of the following is NOT true about cumulative frequencies?

(A) 'Less than' cumulative frequency for a class shows the total count below its upper limit.

(B) 'More than' cumulative frequency for a class shows the total count above or equal to its lower limit.

(C) The cumulative frequency always increases as you move down the table (for 'less than' type).

(D) The cumulative frequency for any class is always less than the frequency of that class.

Answer:
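
The running totals described in Question 5 can be sketched as follows. The four class frequencies are assumed data (total 50) used only for illustration.

```python
from itertools import accumulate

# Assumed frequencies for classes 0-10, 10-20, 20-30, 30-40
frequencies = [5, 12, 20, 13]

# 'Less than' type: running total from the first class downward
less_than_cf = list(accumulate(frequencies))

# 'More than' type: running total from the last class upward
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)
print(more_than_cf)

# Every cumulative frequency is at least as large as its own class frequency
never_smaller = all(cf >= f for cf, f in zip(less_than_cf, frequencies))
```

The last 'less than' value and the first 'more than' value both equal the total frequency.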

Question 6. A frequency distribution table is created for discrete data with a small number of distinct values. Which of the following is NOT the most appropriate type of table?

(A) Ungrouped frequency distribution table.

(B) Grouped frequency distribution table with very wide class intervals.

(C) Simple tally chart leading to an ungrouped table.

(D) Table listing each distinct value and its frequency.

Answer:

Question 7. The total frequency of a dataset is 50. Consider a frequency distribution table for this data. Which of the following is NOT a valid observation?

(A) The sum of all class frequencies is 50.

(B) The 'less than' cumulative frequency for the last class is 50.

(C) The 'more than' cumulative frequency for the first class is 50.

(D) The cumulative frequency for the first class (frequency > 0) must be greater than both 0 and 50.

Answer:

Question 8. Consider an inclusive class interval like 20-29. Which of the following is NOT a correct statement?

(A) Both 20 and 29 are included in this class.

(B) The next class interval would likely start from 30.

(C) The class boundary corresponding to the upper limit 29 is likely 29.5.

(D) This type of interval is ideal for continuous data used in histograms without adjustment.

Answer:

Question 9. When grouping data into class intervals, which of the following is NOT a generally recommended practice?

(A) Using a consistent class size (unless necessary due to data spread).

(B) Ensuring that class intervals are mutually exclusive.

(C) Having a very small number of classes (e.g., 2 or 3).

(D) Covering the entire range of the data with the classes.

Answer:

Question 10. Which of the following is NOT a primary purpose of constructing frequency distribution tables?

(A) To summarize large datasets.

(B) To make raw data easier to understand.

(C) To facilitate the calculation of descriptive statistics.

(D) To eliminate the need for any graphical representation.

Answer:

Graphical Representation of Data: Basic Charts

Question 1. Which of the following basic charts is NOT typically used to represent the frequency distribution of *continuous* data?

(A) Bar Graph.

(B) Pie Chart (can show proportions of grouped continuous data, but not the distribution shape).

(C) Pictograph (can show counts for grouped continuous data, but not distribution shape).

(D) Single Bar Graph.

Answer:

Question 2. In a Bar Graph, which of the following is NOT a standard convention?

(A) The height (or length) of the bar is proportional to the value it represents.

(B) The bars are usually separated by equal spaces.

(C) The bars are always drawn vertically.

(D) All bars have uniform width.

Answer:

Question 3. A Pie Chart represents the proportion of parts to a whole. Which of the following is NOT a necessary step when constructing a pie chart?

(A) Calculate the total value of the data.

(B) Calculate the angle of each sector.

(C) Ensure the sum of percentages for all categories is $100\%$.

(D) Calculate the mean of the data values.

Answer:

Question 4. If a pie chart shows the distribution of students' favorite sports, and Cricket is represented by a sector with an angle of $180^\circ$, which of the following is NOT true?

(A) Cricket is the most popular sport.

(B) Exactly 180 students chose Cricket.

(C) 50% of the students chose Cricket.

(D) The total angle of the circle is $360^\circ$.

Answer:
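
The sector-angle calculation behind Questions 3 and 4 can be sketched as below. The survey counts are assumed for illustration (40 students in total); the point is that a $180^\circ$ sector fixes a share of the whole, not a head count.

```python
def sector_angles(counts):
    """Return the central angle (in degrees) of each pie-chart sector."""
    total = sum(counts.values())
    return {name: 360 * value / total for name, value in counts.items()}


# Assumed survey of 40 students, for illustration only
counts = {"Cricket": 20, "Football": 12, "Hockey": 8}

angles = sector_angles(counts)
cricket_share = 100 * counts["Cricket"] / sum(counts.values())

print(angles)
print(cricket_share)
```

Here Cricket gets a $180^\circ$ sector with only 20 students, because the angle depends on the proportion alone.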

Question 5. A double bar graph is used for comparison. Which of the following is NOT a suitable scenario for using a double bar graph?

(A) Comparing sales of two products over several months.

(B) Comparing the marks of boys and girls in different subjects.

(C) Showing the proportion of different expenses within a single family's budget.

(D) Comparing the population of urban and rural areas across different states.

Answer:

Question 6. Which of the following is NOT a characteristic feature of a pictograph?

(A) Uses symbols to represent data quantities.

(B) A key is provided to indicate the value of each symbol.

(C) It is highly accurate for showing precise data points, especially with large numbers.

(D) It is visually appealing and easy to understand for a general audience.

Answer:

Question 7. When creating a Bar Graph for categorical data, which of the following is NOT placed on the horizontal axis?

(A) Names of categories (e.g., Mumbai, Delhi, Kolkata).

(B) Discrete values (e.g., 1, 2, 3 for number of cars).

(C) The frequencies or values of the categories.

(D) Labels representing the different items being compared.

Answer:

Question 8. Graphical representation of data is beneficial. Which of the following is NOT a key advantage of using charts and graphs?

(A) Makes data easier to understand quickly.

(B) Helps in identifying trends and patterns.

(C) Reduces the data analysis process to just visual inspection.

(D) Facilitates comparison between different data points or groups.

Answer:

Question 9. If a pictograph uses half-symbols, quarter-symbols, etc., which of the following is NOT a reason for doing so?

(A) To represent quantities that are not whole multiples of the symbol's value.

(B) To improve the visual appeal of the graph.

(C) To make the graph harder to interpret.

(D) To allow for more precise representation of data points.

Answer:

Question 10. Which of the following is NOT a basic chart typically introduced early in statistics for data representation?

(A) Histogram.

(B) Bar Graph.

(C) Pie Chart.

(D) Pictograph.

Answer:

Graphical Representation: Frequency Distributions

Question 1. Which of the following is NOT a key characteristic of a Histogram?

(A) It represents the frequency distribution of continuous data.

(B) The bars are adjacent to each other.

(C) The height of each bar is proportional to the frequency density.

(D) It is suitable for representing the frequency of distinct categories (like favourite colours).

Answer:

Question 2. A Frequency Polygon is drawn. Which of the following is NOT a correct statement about its construction or interpretation?

(A) It is formed by joining the midpoints of the tops of histogram bars.

(B) It can be drawn by plotting frequencies against class marks.

(C) The area under the frequency polygon is equal to the area under the corresponding histogram.

(D) It is typically used to represent cumulative frequencies.

Answer:

Question 3. When constructing a histogram with unequal class widths, which of the following is NOT true about the bars?

(A) The width of each bar represents the class interval.

(B) The height of each bar represents the frequency of the class.

(C) The area of each bar is proportional to the frequency of the class.

(D) The y-axis represents frequency density (frequency per unit width).

Answer:
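
For unequal class widths, the bar height is usually taken as frequency density so that bar area tracks frequency. A minimal sketch, assuming three illustrative classes that are not taken from the question:

```python
# (lower boundary, upper boundary, frequency): assumed data
classes = [(0, 10, 8), (10, 30, 20), (30, 60, 15)]

# Bar height = frequency per unit of class width
densities = [f / (upper - lower) for lower, upper, f in classes]

# Bar area = height x width, which recovers the class frequency
areas = [d * (upper - lower) for (lower, upper, _), d in zip(classes, densities)]

for (lower, upper, f), d, a in zip(classes, densities, areas):
    print(f"{lower}-{upper}: frequency={f}, density={d}, area={a}")
```

The class with the largest frequency (10-30) is not twice as tall as the 0-10 class, since its width is twice as large.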

Question 4. To compare the frequency distributions of two different datasets (e.g., marks of two classes) graphically, which of the following is NOT the most appropriate method?

(A) Drawing two separate histograms side-by-side.

(B) Drawing two overlaid frequency polygons on the same axes.

(C) Drawing two separate pie charts for each dataset's mark distribution.

(D) Drawing two overlaid histograms on the same axes.

Answer:

Question 5. When drawing a frequency polygon directly from a frequency table without first drawing a histogram, which of the following is NOT needed?

(A) Class limits.

(B) Class marks.

(C) Frequencies.

(D) Cumulative frequencies.

Answer:

Question 6. Which of the following statements is NOT true about the relationship between histograms and frequency polygons?

(A) A frequency polygon can be derived from a histogram.

(B) The area under the frequency polygon is equal to the total frequency (assuming proper closure).

(C) Histograms are better for visualizing individual class frequencies, while polygons are better for showing the overall shape.

(D) A histogram is suitable for discrete data, while a frequency polygon is only for continuous data.

Answer:

Question 7. If a frequency distribution has open-ended classes (e.g., "Below 10", "100 and above"), which of the following is NOT a challenge when drawing a histogram or frequency polygon?

(A) Determining the exact class marks for the open-ended classes.

(B) Representing the unbounded nature of the open-ended classes graphically.

(C) Calculating the frequencies for the given classes.

(D) Maintaining equal class widths across all intervals.

Answer:

Question 8. When converting inclusive class intervals (e.g., 1-5, 6-10) to continuous class boundaries for a histogram, which of the following is NOT true?

(A) The upper boundary of one class equals the lower boundary of the next class.

(B) The adjustment ensures that there are no gaps between bars in the histogram.

(C) The class mark changes significantly after conversion.

(D) The class size remains the same after conversion (if consistent initially).

Answer:
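
The conversion in Question 8 can be sketched as follows, assuming the inclusive classes 1-5, 6-10, 11-15 and the usual half-gap adjustment. The helper name `to_boundaries` is hypothetical.

```python
def to_boundaries(inclusive_classes):
    """Convert inclusive limits to continuous boundaries using half the gap
    between one class's upper limit and the next class's lower limit."""
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]   # e.g. 6 - 5 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in inclusive_classes]


inclusive = [(1, 5), (6, 10), (11, 15)]
boundaries = to_boundaries(inclusive)

marks_before = [(lower + upper) / 2 for lower, upper in inclusive]
marks_after = [(lower + upper) / 2 for lower, upper in boundaries]

print(boundaries)
print(marks_before, marks_after)
```

The adjustment widens each class symmetrically, so the class marks are identical before and after the conversion.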

Question 9. Which of the following is NOT a way in which a frequency polygon is typically closed to make the area under it equal to the corresponding histogram's area?

(A) Joining the midpoint of the first class (with frequency > 0) to the origin (0,0).

(B) Joining the midpoint of the first class to the midpoint of a hypothetical preceding class with frequency 0.

(C) Joining the midpoint of the last class (with frequency > 0) to the midpoint of a hypothetical succeeding class with frequency 0.

(D) Connecting the first plotted point to the last plotted point directly.

Answer:

Question 10. The y-axis in a histogram with equal class widths represents frequency. In a histogram with unequal class widths, which of the following does the y-axis NOT represent?

(A) Frequency.

(B) Frequency Density.

(C) Height adjusted for class width.

(D) Proportion of total frequency per unit width.

Answer:

Graphical Representation: Cumulative Frequency Graphs

Question 1. Which of the following is NOT a common name or type of cumulative frequency graph?

(A) Ogive.

(B) Less Than Ogive.

(C) More Than Ogive.

(D) Frequency Ogive.

Answer:

Question 2. When constructing a 'less than' ogive, which of the following is NOT plotted on the x-axis?

(A) Upper limits of the class intervals.

(B) Upper boundaries of the class intervals.

(C) Class marks of the class intervals.

(D) Values for which the cumulative frequency is calculated.

Answer:

Question 3. Which of the following is NOT a correct statement about the shape of ogives?

(A) A 'less than' ogive is generally increasing.

(B) A 'more than' ogive is generally decreasing.

(C) Both 'less than' and 'more than' ogives are typically S-shaped.

(D) The steepness of an ogive indicates the concentration of frequencies in that range.

Answer:

Question 4. Which of the following measures of central tendency is NOT typically estimated graphically using ogives?

(A) Median.

(B) First Quartile ($Q_1$).

(C) Third Quartile ($Q_3$).

(D) Mean.

Answer:

Question 5. The total frequency of a dataset is N. When estimating the Median graphically from a 'less than' ogive, which cumulative frequency value is NOT used on the y-axis?

(A) $N/2$.

(B) $P(X=x)$ for the median class.

(C) Half of the total number of observations.

(D) The cumulative frequency corresponding to the 50th percentile.

Answer:

Question 6. Which of the following is NOT a valid interpretation from the intersection point of the 'less than' and 'more than' ogives?

(A) The x-coordinate is the estimated Median.

(B) The y-coordinate is equal to half of the total frequency ($N/2$).

(C) The intersection point is the estimated Mode.

(D) The total number of observations is twice the y-coordinate of the intersection.

Answer:
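
The intersection property in Question 6 can be checked with straight-line interpolation between ogive points. This is a sketch under assumed data (class boundaries 0 to 40, total frequency 50); both helper functions are hypothetical.

```python
def solve_for_x(xs, ys, target):
    """Find x where a non-decreasing piecewise-linear curve reaches target."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if y0 <= target <= y1 and y1 != y0:
            return x0 + (x1 - x0) * (target - y0) / (y1 - y0)
    raise ValueError("target not reached by the curve")


def interpolate(xs, ys, x):
    """Evaluate a piecewise-linear curve at x."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside the plotted range")


boundaries = [0, 10, 20, 30, 40]       # assumed class boundaries
less_than = [0, 5, 17, 37, 50]         # 'less than' ogive points
more_than = [50, 45, 33, 13, 0]        # 'more than' ogive points
n = less_than[-1]

median_x = solve_for_x(boundaries, less_than, n / 2)
more_than_at_median = interpolate(boundaries, more_than, median_x)

print(median_x, more_than_at_median)
```

At the estimated median both curves sit at $N/2$, which is where they cross.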

Question 7. A 'less than' ogive is plotted for a distribution of student marks. Which of the following information cannot be directly read from this ogive?

(A) The number of students who scored less than 60 marks.

(B) The total number of students.

(C) The number of students who scored exactly 75 marks.

(D) The estimated Median mark.

Answer:

Question 8. When constructing a 'more than' ogive, which of the following is NOT used on the y-axis?

(A) Total frequency minus the cumulative frequency less than the lower boundary.

(B) Cumulative frequency greater than or equal to the lower boundary.

(C) Frequency of each specific class interval.

(D) The cumulative frequency starting from the total frequency (N).

Answer:

Question 9. Which of the following is NOT a primary purpose of using ogives?

(A) To determine the number of observations below or above a certain value.

(B) To estimate positional measures like quartiles and percentiles.

(C) To visualize the cumulative nature of the frequency distribution.

(D) To easily identify the class with the highest frequency.

Answer:

Question 10. When preparing data for plotting an ogive from a frequency table with inclusive classes, which of the following is NOT a necessary step?

(A) Calculating cumulative frequencies (either 'less than' or 'more than').

(B) Finding the class boundaries.

(C) Identifying the class marks.

(D) Determining the total frequency.

Answer:

Measures of Central Tendency: Introduction and Mean

Question 1. Which of the following is NOT considered a measure of central tendency?

(A) Mean.

(B) Median.

(C) Standard Deviation.

(D) Mode.

Answer:

Question 2. The Arithmetic Mean is the sum of observations divided by their number. Which of the following is NOT true about the Arithmetic Mean?

(A) It is affected by every observation in the dataset.

(B) It is resistant to extreme values (outliers).

(C) It is unique for a given set of data.

(D) It is often called the average.

Answer:

Question 3. For ungrouped data $x_1, x_2, \ldots, x_n$, which of the following is NOT a correct formula or property related to the Mean?

(A) $\bar{x} = \frac{\sum_{i=1}^n x_i}{n}$.

(B) $\sum_{i=1}^n (x_i - \bar{x}) = 0$.

(C) $\bar{x}$ can be calculated for qualitative data.

(D) Adding a constant 'c' to each $x_i$ results in a new mean $\bar{x} + c$.

Answer:

Question 4. For grouped data with class marks $x_i$ and frequencies $f_i$, which of the following is NOT a method for calculating the mean?

(A) Direct Method ($\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$).

(B) Assumed Mean Method.

(C) Step-Deviation Method.

(D) Finding the middle value after calculating cumulative frequencies.

Answer:

Question 5. If every observation in a dataset is multiplied by a constant 'k' (where $k \neq 0$), which of the following is NOT a correct statement about the new mean?

(A) The new mean is $k$ times the original mean.

(B) The new mean is the original mean plus $k$.

(C) If the original mean was $\bar{x}$, the new mean is $k\bar{x}$.

(D) This transformation affects the mean proportionally.

Answer:
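
The properties of the mean referred to in Questions 3 and 5 can be verified on a small dataset. A minimal sketch with assumed values; the constants `c` and `k` are arbitrary illustrations.

```python
data = [4, 8, 15, 16, 23, 42]          # assumed observations
mean = sum(data) / len(data)

# Deviations from the mean always cancel out
deviation_sum = sum(x - mean for x in data)

# Change of origin: adding c shifts the mean by c
c = 5
shifted_mean = sum(x + c for x in data) / len(data)

# Change of scale: multiplying by k multiplies the mean by k
k = 3
scaled_mean = sum(k * x for x in data) / len(data)

print(mean, deviation_sum, shifted_mean, scaled_mean)
```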

Question 6. Which of the following is NOT a situation where the Mean might be misleading or less appropriate as a measure of central tendency?

(A) When the data contains extreme outliers.

(B) When the distribution is highly skewed.

(C) When the data is perfectly symmetric.

(D) When the data is qualitative (nominal or ordinal).

Answer:

Question 7. In the Assumed Mean method, a value 'A' is chosen to simplify calculations. Which of the following is NOT true about this method?

(A) The actual mean $\bar{x}$ can be found using the formula $\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i}$, where $d_i = x_i - A$.

(B) The choice of 'A' affects the final calculated value of the mean.

(C) This method can be simpler when class marks are large.

(D) It is equivalent to the Direct Method in terms of the final mean value.

Answer:
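
The Assumed Mean formula in option (A) can be tried with two different choices of $A$. The class marks and frequencies below are assumed for illustration, and `assumed_mean_method` is a hypothetical helper name.

```python
import math


def assumed_mean_method(class_marks, frequencies, a):
    """Mean via x_bar = A + sum(f_i * d_i) / sum(f_i), with d_i = x_i - A."""
    total = sum(frequencies)
    weighted_dev = sum(f * (x - a) for x, f in zip(class_marks, frequencies))
    return a + weighted_dev / total


class_marks = [5, 15, 25, 35]          # assumed class marks
frequencies = [5, 12, 20, 13]          # assumed frequencies

direct = sum(f * x for x, f in zip(class_marks, frequencies)) / sum(frequencies)
with_a_25 = assumed_mean_method(class_marks, frequencies, 25)
with_a_15 = assumed_mean_method(class_marks, frequencies, 15)

print(direct, with_a_25, with_a_15)
print(math.isclose(with_a_25, with_a_15))
```

Up to floating-point rounding, the result matches the Direct Method whichever $A$ is picked.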

Question 8. Which of the following is NOT a correct statement about the relationship between the number of observations and the calculation of the Mean?

(A) The total number of observations is the denominator in the mean formula for ungrouped data.

(B) For grouped data, the sum of frequencies is the total number of observations.

(C) The number of classes directly determines the mean.

(D) A larger number of observations generally provides a more reliable sample mean as an estimate of the population mean.

Answer:

Question 9. Which of the following is NOT a valid property of the arithmetic mean for numerical data?

(A) It balances the deviations on either side (sum of deviations is zero).

(B) The sum of squared deviations from the mean is minimum.

(C) It can be graphically estimated from a histogram as the peak of the highest bar.

(D) It is affected by changes in scale (multiplication) and origin (addition/subtraction).

Answer:

Question 10. When using the Step-Deviation method, which of the following is NOT a requirement or characteristic?

(A) The class intervals should generally have equal width.

(B) Deviations are divided by the class size to simplify calculations.

(C) This method is typically simpler than the Assumed Mean method when class intervals are equal and class marks are large.

(D) The final mean needs to be adjusted by multiplying the result by the class size.

Answer:
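
A minimal sketch of the Step-Deviation method, assuming the commonly used form $\bar{x} = A + h \cdot \frac{\sum f_i u_i}{\sum f_i}$ with $u_i = (x_i - A)/h$. The data, the assumed mean $A = 25$ and the class size $h = 10$ are illustrative choices.

```python
class_marks = [5, 15, 25, 35]          # assumed class marks
frequencies = [5, 12, 20, 13]          # assumed frequencies
a = 25                                 # assumed mean (any class mark works)
h = 10                                 # common class size

# Step deviations: deviations from A measured in units of the class size
u = [(x - a) / h for x in class_marks]

n = sum(frequencies)
mean_u = sum(f * ui for f, ui in zip(frequencies, u)) / n

# Scale the mean step deviation back by h, then shift by A
mean_step = a + h * mean_u

mean_direct = sum(f * x for x, f in zip(class_marks, frequencies)) / n
print(mean_step, mean_direct)
```

Only the mean of the step deviations is multiplied by $h$; the assumed mean $A$ is then added on.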

Measures of Central Tendency: Median

Question 1. Which of the following is NOT a definition or property of the Median?

(A) The middle value of an ordered dataset.

(B) Divides the data into two equal halves.

(C) Is significantly affected by extreme values (outliers).

(D) It is a positional average.

Answer:

Question 2. To find the Median of ungrouped data, which of the following is NOT a correct step or rule?

(A) Arrange the data in ascending or descending order.

(B) Find the most frequent value.

(C) If n is odd, the median is the value at the $(\frac{n+1}{2})^{th}$ position.

(D) If n is even, the median is the average of the values at the $(\frac{n}{2})^{th}$ and $(\frac{n}{2}+1)^{th}$ positions.

Answer:
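
The odd and even rules in options (C) and (D) can be sketched as below. The function name `median_ungrouped` and the sample values are assumptions for illustration; the result is cross-checked against the standard library's `statistics.median`.

```python
from statistics import median as library_median


def median_ungrouped(values):
    """Median by ordering the data and taking the middle position(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # ((n + 1) / 2)-th value in 1-based counting
        return ordered[mid]
    # Average of the (n / 2)-th and (n / 2 + 1)-th values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_ungrouped([7, 3, 9, 1, 5])
even_case = median_ungrouped([7, 3, 9, 1])

print(odd_case, even_case)
```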

Question 3. For grouped data, which of the following is NOT a necessary component when calculating the Median using the formula $M = L + \frac{(N/2 - cf)}{f} \times h$?

(A) L, the lower boundary of the median class.

(B) $N/2$, half of the total frequency.

(C) $f_1$, the frequency of the class succeeding the median class.

(D) cf, the cumulative frequency of the class preceding the median class.

Answer:
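
The grouped-data formula $M = L + \frac{(N/2 - cf)}{f} \times h$ from Question 3 can be sketched as follows. The class boundaries and frequencies are assumed, and `grouped_median` is a hypothetical helper.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data: M = L + ((N/2 - cf) / f) * h."""
    n = sum(frequencies)
    cf = 0                                   # cumulative frequency so far
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cf + f >= n / 2:        # first class reaching N/2
            h = upper - lower                # width of the median class
            return lower + (n / 2 - cf) / f * h
        cf += f
    raise ValueError("empty distribution")


boundaries = [(0, 10), (10, 20), (20, 30), (30, 40)]   # assumed classes
frequencies = [5, 12, 20, 13]                          # assumed frequencies

m = grouped_median(boundaries, frequencies)
print(m)
```

With these numbers the median class is 20-30, so $L = 20$, $cf = 17$, $f = 20$ and $h = 10$.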

Question 4. Which of the following is NOT a correct statement about using the Median compared to the Mean?

(A) Median is preferred for skewed distributions.

(B) Median is preferred when there are outliers.

(C) Mean is preferred for symmetric distributions with interval/ratio data.

(D) Median is always easier to calculate than the Mean for grouped data.

Answer:

Question 5. The Median can be estimated graphically. Which of the following is NOT a correct method for graphically estimating the Median?

(A) From a 'less than' ogive by locating $N/2$ on the y-axis.

(B) From a 'more than' ogive by locating $N/2$ on the y-axis.

(C) From the intersection point of the 'less than' and 'more than' ogives.

(D) From the highest bar in a histogram.

Answer:

Question 6. Which of the following types of data is the Median NOT suitable for?

(A) Discrete numerical data.

(B) Continuous numerical data.

(C) Ordinal categorical data (where data can be ranked).

(D) Nominal categorical data (where data cannot be ranked).

Answer:

Question 7. If every observation in a dataset is multiplied by a constant 'c', which of the following is NOT true about the new Median?

(A) The new Median is the original Median multiplied by $|c|$.

(B) The new Median is the original Median plus $c$.

(C) If $c>0$, the new Median is $c$ times the original Median.

(D) The position of the Median in the ordered data remains the same.

Answer:

Question 8. For grouped data with inclusive classes, which of the following is NOT a valid approach when preparing for Median calculation?

(A) Use the given limits directly in the Median formula (L should be the lower boundary).

(B) Convert the inclusive limits to exclusive boundaries.

(C) Calculate cumulative frequencies.

(D) Identify the median class based on $N/2$.

Answer:

Question 9. Which of the following is NOT a correct interpretation of the Median value of 60 for a dataset?

(A) Half of the observations are less than or equal to 60.

(B) Half of the observations are greater than or equal to 60.

(C) 60 is the most frequent value in the dataset.

(D) 50% of the data falls below 60.

Answer:

Question 10. The Median is a resistant measure of central tendency. Which of the following is NOT implied by this property?

(A) It is not significantly affected by extreme values.

(B) It gives a better sense of the typical value in skewed distributions than the mean.

(C) Its value changes dramatically if a single extreme observation is added.

(D) It focuses on the center of the data distribution.

Answer:

Measures of Central Tendency: Mode and Relationship

Question 1. Which of the following is NOT a property of the Mode?

(A) It is the most frequent value.

(B) Every dataset has a unique mode.

(C) It can be used for qualitative data.

(D) It is not affected by extreme values.

Answer:

Question 2. For a grouped frequency distribution, which of the following is NOT the correct way to identify the modal class?

(A) Find the class interval with the highest frequency.

(B) Find the class interval with the largest cumulative frequency.

(C) Identify the class that corresponds to the peak of the histogram.

(D) Look for the class interval that occurs most often in the frequency column (this is a redundant definition).

Answer:

Question 3. Consider the empirical formula relating Mean, Median, and Mode for moderately skewed distributions: Mode $\approx$ 3 Median - 2 Mean. Which of the following is NOT implied by this relationship?

(A) Mean - Mode $\approx$ 3 (Mean - Median).

(B) The relationship is exact for all distributions.

(C) For symmetric distributions (Mean=Median=Mode), the formula holds (e.g., $x = 3x - 2x$).

(D) If Mean > Median, then Mode is likely less than Median (for moderate skew).

Answer:
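
The empirical relationship in Question 3 can be explored numerically. A minimal sketch, assuming illustrative values Mean = 23.2 and Median = 24.0; the output is only an approximation for moderately skewed data, not an exact mode.

```python
def empirical_mode(mean, median):
    """Approximate mode from Mode ~ 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


mean, median = 23.2, 24.0                 # assumed summary values
mode_estimate = empirical_mode(mean, median)

# Rearranged form: Mean - Mode ~ 3 * (Mean - Median)
lhs = mean - mode_estimate
rhs = 3 * (mean - median)

# Symmetric case: all three averages coincide
symmetric_mode = empirical_mode(10, 10)

print(mode_estimate, lhs, rhs, symmetric_mode)
```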

Question 4. For a negatively skewed distribution, which of the following relationships between Mean, Median, and Mode is NOT typically true?

(A) Mean < Median.

(B) Median < Mode.

(C) Mean > Mode.

(D) Mean < Median < Mode.

Answer:

Question 5. The Mode can be estimated graphically from a histogram. Which of the following is NOT a step in this graphical estimation?

(A) Identify the modal class (bar with the highest frequency).

(B) Draw lines from the top corners of the modal bar to the opposite top corners of the adjacent bars.

(C) Draw a perpendicular line from the intersection point of these lines to the y-axis.

(D) The value on the x-axis at this perpendicular is the estimated mode.

Answer:

Question 6. In the formula for the Mode of grouped data, $\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$, which of the following is NOT a correct description of a component?

(A) L is the lower boundary of the modal class.

(B) $f_1$ is the frequency of the modal class.

(C) $f_0$ is the frequency of the class succeeding the modal class.

(D) $f_2$ is the frequency of the class succeeding the modal class.

Answer:
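
The grouped-data mode formula from Question 6 can be sketched as below, taking $f_0$ as the frequency of the class before the modal class and $f_2$ as the frequency of the class after it. The data are assumed, `grouped_mode` is a hypothetical helper, and a missing neighbouring class is treated as frequency 0 (an assumption of this sketch).

```python
def grouped_mode(boundaries, frequencies):
    """Mode of grouped data: L + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    i = frequencies.index(max(frequencies))      # modal class index
    lower, upper = boundaries[i]
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                      # preceding class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined when f1 = f0 = f2")
    return lower + (f1 - f0) / denominator * (upper - lower)


boundaries = [(0, 10), (10, 20), (20, 30), (30, 40)]   # assumed classes
frequencies = [5, 12, 20, 13]                          # assumed frequencies

mode = grouped_mode(boundaries, frequencies)
print(mode)
```

Here the modal class is 20-30 with $f_1 = 20$, $f_0 = 12$ and $f_2 = 13$, so the estimate lies inside that class.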

Question 7. Which of the following is NOT a measure of central tendency that is always unique for any given dataset?

(A) Mean.

(B) Median.

(C) Mode (can be bimodal or multimodal).

(D) All three are always unique.

Answer:

Question 8. Which of the following is NOT a correct statement about comparing Mean, Median, and Mode?

(A) Mean is best for symmetric data without outliers.

(B) Median is best for skewed data or data with outliers.

(C) Mode is best for nominal data or identifying the most frequent category.

(D) The Mean is always located between the Median and the Mode in skewed distributions.

Answer:

Question 9. If a dataset has values {10, 20, 30, 40, 50} where each value appears once, which of the following is NOT true about the mode?

(A) Each value has a frequency of 1.

(B) The dataset is multimodal (every value is a mode).

(C) The dataset has no unique mode.

(D) The mode is 30.

Answer:

Question 10. For a perfectly symmetric, unimodal distribution, which of the following is NOT true?

(A) Mean = Median = Mode.

(B) The empirical formula relating Mean, Median, and Mode does not apply.

(C) Skewness is zero.

(D) The distribution is balanced around the central point.

Answer:

Measures of Dispersion: Range and Mean Deviation

Question 1. Which of the following is NOT a measure of dispersion?

(A) Range.

(B) Mean Deviation.

(C) Median.

(D) Standard Deviation.

Answer:

Question 2. The Range is the difference between the maximum and minimum values. Which of the following is NOT a disadvantage of using the Range?

(A) It is highly affected by extreme values.

(B) It does not consider the distribution of data points between the extremes.

(C) It is difficult to calculate for simple datasets.

(D) It only uses two values from the dataset.

Answer:

Question 3. Mean Deviation is the average of absolute deviations from a central value. Which of the following is NOT true about Mean Deviation?

(A) It uses absolute values of deviations.

(B) It can be calculated from the Mean or the Median.

(C) Mean Deviation from the Mean is always less than Mean Deviation from the Median.

(D) It gives equal weight to all deviations regardless of their size (relative to squared deviations).

Answer:

Question 4. If a constant 'c' is added to every observation in a dataset, which of the following is NOT true about the Mean Deviation?

(A) The Mean Deviation increases by 'c'.

(B) The central value (Mean or Median) increases by 'c'.

(C) The individual deviations $|x_i - A|$ remain unchanged.

(D) The Mean Deviation remains unchanged.

Answer:

Question 5. If every observation in a dataset is multiplied by a positive constant 'k', which of the following is NOT true about the Range?

(A) The minimum value is multiplied by k.

(B) The maximum value is multiplied by k.

(C) The Range is multiplied by $k^2$.

(D) The new Range is k times the original Range.

Answer:

Question 6. Which of the following is NOT a correct statement about calculating Mean Deviation for grouped data?

(A) Class marks are used to represent the values in each class.

(B) Frequencies are multiplied by the absolute deviations.

(C) The sum of $f_i |x_i - A|$ is divided by the number of classes.

(D) The denominator is the total frequency, $\sum f_i$.

Answer:
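The grouped-data calculation referred to in Question 6 can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name and the sample class marks and frequencies are hypothetical, and the central value $A$ is taken to be the mean.

```python
def grouped_mean_deviation(class_marks, frequencies):
    """Mean deviation about the mean for grouped data (illustrative sketch)."""
    total_frequency = sum(frequencies)  # denominator is the total frequency
    mean = sum(f * x for x, f in zip(class_marks, frequencies)) / total_frequency
    # Sum of f_i * |x_i - A| with A = mean
    weighted_abs_dev = sum(f * abs(x - mean) for x, f in zip(class_marks, frequencies))
    return weighted_abs_dev / total_frequency


# Hypothetical data: class marks 5, 15, 25 with frequencies 2, 5, 3
md = grouped_mean_deviation([5, 15, 25], [2, 5, 3])
print(md)
```

Here the mean is 16, the weighted absolute deviations sum to 54, and dividing by the total frequency 10 gives 5.4.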

Question 7. Which of the following is NOT a reason why Mean Deviation is sometimes considered less mathematically convenient than Variance or Standard Deviation?

(A) The use of absolute values makes algebraic manipulations difficult.

(B) It does not have as many desirable mathematical properties as measures based on squared deviations.

(C) It is always more affected by extreme values than Standard Deviation.

(D) It is not differentiable at certain points due to the absolute value function.

Answer:

Question 8. Which of the following is NOT true about the relationship between the Range and the Mean Deviation?

(A) The Mean Deviation never exceeds half the Range.

(B) Both are measures of absolute dispersion.

(C) Both are equally sensitive to extreme values.

(D) The Range is a simpler measure than the Mean Deviation.

Answer:

Question 9. For the data set {5, 5, 5, 5}, which of the following is NOT true?

(A) The Range is 0.

(B) The Mean Deviation from the Mean is 0.

(C) The Mean Deviation from the Median is 0.

(D) The Mean Deviation is undefined.

Answer:

Question 10. Which of the following is NOT a situation in which the Mean Deviation from the Mean typically differs from the Mean Deviation from the Median?

(A) For perfectly symmetric distributions.

(B) For skewed distributions.

(C) When the data has outliers.

(D) When the Mean and Median are different.

Answer:

Measures of Dispersion: Variance and Standard Deviation

Question 1. Which of the following is NOT a correct definition or formula for Variance?

(A) Average of the squared deviations from the mean.

(B) $E[(X - E(X))^2]$ for a random variable X.

(C) The positive square root of the Standard Deviation.

(D) $E(X^2) - [E(X)]^2$ for a random variable X.

Answer:

Question 2. Which of the following is NOT a correct statement about the Standard Deviation?

(A) It is the positive square root of the Variance.

(B) Its units are the same as the units of the original data.

(C) It is always negative.

(D) It measures the typical distance of data points from the mean.

Answer:

Question 3. If a constant 'c' is added to every observation in a dataset, which of the following is NOT true about the Variance and Standard Deviation?

(A) The Variance remains unchanged.

(B) The Standard Deviation remains unchanged.

(C) The Standard Deviation increases by 'c'.

(D) The sum of squared deviations from the mean remains unchanged.

Answer:

Question 4. If every observation in a dataset is multiplied by a constant 'k', which of the following is NOT true about the Variance?

(A) The new Variance is the original Variance multiplied by k.

(B) The new Variance is the original Variance multiplied by $k^2$.

(C) $Var(kX) = k^2 Var(X)$.

(D) The effect is different from adding a constant.

Answer:
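The effects described in Questions 3 and 4 can be checked numerically. The sketch below uses a hypothetical dataset and hypothetical constants `c` and `k`, and relies on `statistics.pvariance` (population variance) from the Python standard library.

```python
from math import isclose
from statistics import pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
c, k = 10, 3                     # hypothetical shift and scale constants

shifted = [x + c for x in data]  # add c to every observation
scaled = [k * x for x in data]   # multiply every observation by k

var_original = pvariance(data)
var_shifted = pvariance(shifted)
var_scaled = pvariance(scaled)

print(isclose(var_shifted, var_original))          # shifting leaves the variance unchanged
print(isclose(var_scaled, k ** 2 * var_original))  # scaling multiplies the variance by k^2
```

The standard deviation, being the square root of the variance, is therefore unchanged by the shift and multiplied by $|k|$ under scaling.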

Question 5. Which of the following is NOT a reason why Standard Deviation is widely used and preferred over Mean Deviation in many statistical analyses?

(A) It has better mathematical properties due to the use of squares.

(B) It is always easier to calculate than Mean Deviation.

(C) It is more amenable to further mathematical treatment and theoretical development.

(D) It forms the basis of many other statistical measures, such as the Coefficient of Variation and the correlation coefficient.

Answer:

Question 6. For grouped data with class marks $x_i$ and frequencies $f_i$, which of the following is NOT a part of the Variance calculation?

(A) Calculating the mean ($\bar{x}$).

(B) Finding the deviations $(x_i - \bar{x})$.

(C) Taking the absolute values of the deviations $|x_i - \bar{x}|$.

(D) Squaring the deviations $(x_i - \bar{x})^2$ and multiplying by frequencies $f_i$.

Answer:

Question 7. If all the observations in a dataset are identical (e.g., 10, 10, 10), which of the following is NOT true?

(A) The Mean is 10.

(B) The Variance is 0.

(C) The Standard Deviation is 0.

(D) The Standard Deviation is equal to the Mean.

Answer:

Question 8. For a sample of size n, the sum of squared deviations from the sample mean is $\sum (x_i - \bar{x})^2$. Which of the following is NOT a correct statement about this sum?

(A) It is used in the calculation of sample variance.

(B) It is minimum when deviations are taken from the sample mean.

(C) It is always zero for any dataset.

(D) It is related to the variability of the sample data.

Answer:

Question 9. Which of the following is NOT a measure of dispersion that is expressed in squared units of the original data?

(A) Variance.

(B) Standard Deviation.

(C) Population Variance ($\sigma^2$).

(D) Sample Variance ($s^2$).

Answer:

Question 10. If the variance of a dataset is 81, which of the following is NOT a correct statement about its standard deviation?

(A) The standard deviation is $\sqrt{81}$.

(B) The standard deviation is 9.

(C) The standard deviation is -9.

(D) The standard deviation is a measure of the spread of the data.

Answer:

Measures of Relative Dispersion and Moments

Question 1. Measures of relative dispersion are used to compare the variability of different datasets. Which of the following is NOT a common measure of relative dispersion?

(A) Coefficient of Variation.

(B) Coefficient of Range.

(C) Coefficient of Quartile Deviation.

(D) Standard Deviation.

Answer:

Question 2. The Coefficient of Variation (CV) is calculated as $(\text{Standard Deviation} / \text{Mean}) \times 100$. Which of the following is NOT true about CV?

(A) It is usually expressed as a percentage.

(B) It is a dimensionless measure.

(C) It is always less than $100\%$.

(D) A higher CV indicates greater relative variability.

Answer:

Question 3. Comparing the variability of daily price changes of gold and silver is a typical application for measures of relative dispersion. Which of the following is NOT a reason for using relative dispersion in this case?

(A) Gold and silver prices are likely in different value ranges.

(B) The units of measurement (e.g., $\textsf{₹}$) are the same.

(C) Comparing absolute standard deviations directly might be misleading.

(D) Relative dispersion helps compare variability independent of the scale of the data.

Answer:

Question 4. Moments are used to describe the shape and characteristics of a distribution. Which of the following is NOT a correct statement about moments?

(A) Raw moments are calculated about the origin (0).

(B) Central moments are calculated about the mean.

(C) The first central moment is equal to the mean.

(D) The second central moment is equal to the variance.

Answer:
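To make the distinction between raw and central moments in Question 4 concrete, here is a small sketch. The helper names `raw_moment` and `central_moment` and the dataset are hypothetical, chosen only for illustration.

```python
def raw_moment(data, r):
    """r-th raw moment: average of x^r, taken about the origin."""
    return sum(x ** r for x in data) / len(data)


def central_moment(data, r):
    """r-th central moment: average of (x - mean)^r."""
    mean = raw_moment(data, 1)  # the first raw moment is the mean
    return sum((x - mean) ** r for x in data) / len(data)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
print(raw_moment(data, 1))       # mean
print(central_moment(data, 1))   # first central moment
print(central_moment(data, 2))   # second central moment (population variance)
```

For this dataset the mean is 5, the first central moment is 0 (deviations from the mean always cancel), and the second central moment is 4.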

Question 5. If the mean of a distribution is 0, which of the following is NOT true about the Coefficient of Variation?

(A) The CV is undefined because the denominator is zero.

(B) The CV can still be calculated as long as the standard deviation is positive.

(C) The concept of relative variability with respect to a zero mean is not meaningful.

(D) Comparing CVs of distributions with zero or negative means is generally not advisable.

Answer:

Question 6. Which of the following is NOT a valid interpretation of a higher Coefficient of Variation for one dataset compared to another?

(A) The dataset has more variability relative to its mean.

(B) The dataset is more inconsistent.

(C) The dataset is more uniform.

(D) The standard deviation is a larger fraction of the mean.

Answer:

Question 7. If Mean = 50 and Standard Deviation = 10, which of the following is NOT a correct calculation or interpretation?

(A) Coefficient of Variation = $(10/50) \times 100 = 20\%$.

(B) The data is highly variable in absolute terms.

(C) The data is relatively consistent.

(D) The variance is 100.

Answer:
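The arithmetic in Question 7 can be reproduced with a short sketch. The function name is hypothetical; the formula is the one stated in Question 2, $(\text{Standard Deviation} / \text{Mean}) \times 100$, and the guard reflects the zero-mean case raised in Question 5.

```python
def coefficient_of_variation(mean, std_dev):
    """Coefficient of Variation as a percentage (illustrative sketch)."""
    if mean == 0:
        # CV is undefined when the mean is zero
        raise ValueError("CV is undefined for a zero mean")
    return 100 * std_dev / mean


cv = coefficient_of_variation(mean=50, std_dev=10)
variance = 10 ** 2  # variance is the square of the standard deviation
print(cv, variance)
```

With Mean = 50 and Standard Deviation = 10 this gives a CV of 20% and a variance of 100.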

Question 8. Which of the following is NOT a measure of relative dispersion that is commonly used?

(A) Coefficient of Mean Deviation ($\frac{\text{Mean Deviation}}{\text{Mean or Median}}$).

(B) Coefficient of Quartile Deviation ($\frac{Q_3 - Q_1}{Q_3 + Q_1}$).

(C) Coefficient of Variation.

(D) Interquartile Range (IQR).

Answer:

Question 9. The concept of moments is fundamental in describing distributions. Which of the following is NOT a property that can be derived from moments?

(A) Central tendency (Mean from first raw moment).

(B) Dispersion (Variance from second central moment).

(C) The original raw data values.

(D) Skewness (from third central moment) and Kurtosis (from fourth central moment).

Answer:

Question 10. Which of the following is NOT a correct statement about comparing the consistency of two investment options using Coefficient of Variation?

(A) The option with the lower CV is considered more consistent or less risky (in terms of relative returns).

(B) The units of return (e.g., percentage) must be the same for both options.

(C) CV is a suitable measure even if the average returns are significantly different.

(D) The option with the higher standard deviation is always more risky in terms of returns.

Answer:

Skewness and Kurtosis

Question 1. Skewness measures the asymmetry of a distribution. Which of the following is NOT a correct statement about skewness?

(A) A symmetric distribution has zero skewness.

(B) A distribution with a longer tail to the right is negatively skewed.

(C) A distribution with a longer tail to the left is negatively skewed.

(D) Skewness indicates the direction and degree of asymmetry.

Answer:

Question 2. For a positively skewed distribution, which of the following relationships between Mean, Median, and Mode is NOT typically observed?

(A) Mean > Median.

(B) Median > Mode.

(C) Mean < Median < Mode.

(D) The Mean is pulled towards the longer tail.

Answer:

Question 3. Kurtosis measures the peakedness and tail heaviness of a distribution. Which of the following is NOT a correct statement about kurtosis?

(A) A high kurtosis value indicates a sharper peak and fatter tails than a normal distribution.

(B) A low kurtosis value indicates a flatter peak and thinner tails than a normal distribution.

(C) Kurtosis directly measures the asymmetry of the distribution.

(D) The normal distribution is considered mesokurtic.

Answer:

Question 4. Which of the following is NOT a valid method for measuring skewness?

(A) Karl Pearson's coefficient of skewness using Mean and Mode.

(B) Karl Pearson's coefficient of skewness using Mean and Median.

(C) Bowley's coefficient of skewness using quartiles.

(D) Using the second central moment (Variance).

Answer:

Question 5. For a symmetric distribution, which of the following is NOT true?

(A) The coefficient of skewness is 0.

(B) The Mean, Median, and Mode are equal.

(C) The distribution must be a Normal distribution.

(D) The left and right halves of the distribution are mirror images.

Answer:

Question 6. In the context of kurtosis, which of the following is NOT a correct description?

(A) Leptokurtic distributions have a sharp peak and fat tails.

(B) Platykurtic distributions have a flat peak and thin tails.

(C) Mesokurtic distributions have kurtosis higher than leptokurtic ones.

(D) Kurtosis focuses on the concentration of data in the peak and tails.

Answer:

Question 7. For a distribution, if Mean < Median < Mode, which of the following is NOT typically true?

(A) The distribution is positively skewed.

(B) The distribution is negatively skewed.

(C) The long tail is on the left side.

(D) The Mode is the highest value among the three measures.

Answer:

Question 8. Which of the following is NOT a correct statement about the role of skewness and kurtosis?

(A) They describe the shape of the distribution.

(B) Skewness describes asymmetry, Kurtosis describes peakedness/tailedness.

(C) They are measures of central tendency.

(D) Analyzing them helps in understanding the characteristics of the data distribution beyond just center and spread.

Answer:

Question 9. If Bowley's coefficient of skewness is calculated as 0, which of the following is NOT necessarily true?

(A) $Q_1 + Q_3 = 2Q_2$.

(B) The distribution is perfectly symmetric.

(C) The Mean, Median, and Mode are equal.

(D) The distances ($\text{Median} - Q_1$) and ($Q_3 - \text{Median}$) are equal.

Answer:
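As a rough illustration of Question 9, Bowley's coefficient can be computed from the three quartiles using the standard formula $\frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$. The function name and the quartile values below are hypothetical.

```python
def bowley_skewness(q1, q2, q3):
    """Bowley's quartile coefficient of skewness (illustrative sketch)."""
    if q3 == q1:
        raise ValueError("undefined when Q3 equals Q1")
    return (q3 + q1 - 2 * q2) / (q3 - q1)


balanced = bowley_skewness(10, 20, 30)      # quartiles evenly spaced around the median
right_skewed = bowley_skewness(10, 15, 30)  # Q3 lies farther from the median than Q1
print(balanced, right_skewed)
```

A value of 0 only says that $Q_1$ and $Q_3$ are equally far from the median. Because the formula ignores everything outside the quartiles, the tails of the distribution may still be unbalanced.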

Question 10. Karl Pearson's coefficient of skewness is given by $\frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$ or $\frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$. Which of the following is NOT a characteristic of this coefficient?

(A) Its sign indicates the direction of skewness.

(B) It is a dimensionless measure.

(C) It is resistant to extreme values because it uses the mean.

(D) Its value typically lies between -3 and +3 (though it can fall outside this range).

Answer:
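Both forms of Karl Pearson's coefficient quoted in Question 10 can be written as one-line functions. This is only a sketch: the function names and the summary statistics passed in are hypothetical.

```python
def pearson_skewness_mode(mean, mode, std_dev):
    """(Mean - Mode) / Standard Deviation."""
    return (mean - mode) / std_dev


def pearson_skewness_median(mean, median, std_dev):
    """3 * (Mean - Median) / Standard Deviation."""
    return 3 * (mean - median) / std_dev


# Hypothetical summary statistics for two datasets
sk_positive = pearson_skewness_median(mean=50, median=48, std_dev=10)
sk_negative = pearson_skewness_mode(mean=40, mode=45, std_dev=10)
print(sk_positive, sk_negative)
```

The sign follows the position of the mean relative to the median or mode: positive when the mean is larger, negative when it is smaller. Because the numerator and denominator share the same units, the result is dimensionless.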

Percentiles and Quartiles

Question 1. Which of the following is NOT a measure that divides an ordered dataset into equal parts?

(A) Quartiles.

(B) Percentiles.

(C) Deciles.

(D) Range.

Answer:

Question 2. Quartiles divide the data into four equal parts. Which of the following is NOT a correct statement about quartiles?

(A) $Q_1$ is the first quartile.

(B) $Q_2$ is the median.

(C) $Q_3$ is the third quartile.

(D) $Q_4$ is the maximum value in the dataset.

Answer:

Question 3. The $k^{th}$ percentile ($P_k$) is the value below which approximately k% of the observations fall. Which of the following is NOT a correct relationship between quartiles and percentiles?

(A) $Q_1 = P_{25}$.

(B) $Q_2 = P_{50}$.

(C) $Q_3 = P_{75}$.

(D) $Q_1 = P_{50}$.

Answer:

Question 4. The Interquartile Range (IQR) is $Q_3 - Q_1$. Which of the following is NOT true about IQR?

(A) It measures the spread of the middle 50% of the data.

(B) It is a measure of dispersion.

(C) It is heavily affected by extreme values.

(D) It is used in constructing box plots.

Answer:

Question 5. Quartile Deviation (QD) is calculated as $(Q_3 - Q_1) / 2$. Which of the following is NOT a property of QD?

(A) It is also called Semi-Interquartile Range.

(B) It is half of the IQR.

(C) It is a measure of central tendency.

(D) It is less affected by outliers than the Range.

Answer:

Question 6. For grouped data, which of the following is NOT a correct step when calculating quartiles or percentiles?

(A) Calculate cumulative frequencies.

(B) Identify the class interval containing the desired quartile/percentile based on its position (e.g., $N/4$, $kN/100$).

(C) Use the Median formula, adjusting $N/2$ to the appropriate position ($kN/100$ or $kN/4$).

(D) Use class marks directly in the calculation without using cumulative frequencies.

Answer:

Question 7. Percentile rank of a value X is the percentage of observations less than or equal to X. Which of the following is NOT a valid statement?

(A) If a score is at the 85th percentile, it is higher than 85% of scores.

(B) A score at the 50th percentile is the median.

(C) Percentile ranks range from 0 to 100.

(D) Percentile rank is calculated using the formula for percentiles.

Answer:

Question 8. Which of the following is NOT a correct statement about the relationship between quartiles and the median in a symmetric distribution?

(A) The Median is equidistant from $Q_1$ and $Q_3$.

(B) $Q_2 = \text{Median}$.

(C) $\text{Median} - Q_1 = Q_3 - \text{Median}$.

(D) $Q_1 = Q_3$.

Answer:

Question 9. Which of the following is NOT a correct statement about the use of quartiles and percentiles?

(A) They provide information about the distribution of data points.

(B) They are useful for identifying outliers (e.g., values more than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$).

(C) They can be used to calculate the mean of the distribution.

(D) They are suitable for comparing the relative standing of data points within different datasets.

Answer:
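The outlier rule mentioned in Question 9 (values more than 1.5 times the IQR beyond the quartiles) can be sketched with `statistics.quantiles` from the Python standard library. The dataset and the helper name `iqr_fences` are hypothetical, and the quartiles depend on the interpolation convention, so other textbooks or tools may give slightly different fences.

```python
from statistics import quantiles


def iqr_fences(data):
    """Lower and upper outlier fences at 1.5 * IQR beyond Q1 and Q3."""
    # quantiles(..., n=4) returns [Q1, Q2, Q3] using the default 'exclusive' method
    q1, _q2, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [12, 14, 15, 15, 16, 17, 18, 19, 60]  # hypothetical observations
low, high = iqr_fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```

For this dataset the quartiles are 14.5 and 18.5, so the fences are 8.5 and 24.5 and only the value 60 is flagged.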

Question 10. Quartiles and percentiles can be estimated graphically from ogives. Which of the following is NOT true about this graphical estimation?

(A) $Q_1$ is estimated from $N/4$ on the cumulative frequency axis.

(B) $P_{75}$ is estimated from $3N/4$ on the cumulative frequency axis.

(C) $P_{10}$ is estimated from $10N/100$ on the cumulative frequency axis.

(D) The Mean is estimated from the intersection of the two ogives.

Answer:

Correlation

Question 1. Which of the following is NOT a correct statement about correlation?

(A) It measures the strength of a linear relationship between two variables.

(B) It indicates the direction of the linear relationship.

(C) A strong correlation implies that one variable causes the other.

(D) It is represented by a coefficient that typically ranges from -1 to +1.

Answer:

Question 2. A scatter diagram is used to visualize the relationship between two variables. Which of the following is NOT a correct interpretation of a pattern in a scatter diagram?

(A) Points clustering along an upward sloping line indicate positive correlation.

(B) Points scattered randomly indicate a perfect positive correlation.

(C) Points clustering along a downward sloping line indicate negative correlation.

(D) Points forming a perfect straight line indicate perfect correlation.

Answer:

Question 3. Karl Pearson's Coefficient of Correlation ($r$) measures the strength and direction of linear association. Which of the following is NOT a possible value for '$r$'?

(A) 0.8.

(B) $-0.5$.

(C) 1.2.

(D) 0.

Answer:

Question 4. If Karl Pearson's coefficient of correlation ($r$) between two variables is found to be 0, which of the following is NOT necessarily true?

(A) There is no linear relationship between the variables.

(B) The variables are independent.

(C) The points in a scatter diagram are likely randomly scattered.

(D) There might be a non-linear relationship between the variables.

Answer:
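Question 4 turns on the fact that $r = 0$ rules out only a linear relationship. The sketch below computes Pearson's $r$ from its definition for a hypothetical dataset in which $y$ is completely determined by $x$ (here $y = x^2$), yet the coefficient is zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


xs = [-2, -1, 0, 1, 2]                   # hypothetical x values, symmetric about 0
ys_quadratic = [x ** 2 for x in xs]      # y depends on x, but not linearly
ys_linear = [2 * x + 1 for x in xs]      # exact upward-sloping line

r_quadratic = pearson_r(xs, ys_quadratic)
r_linear = pearson_r(xs, ys_linear)
print(r_quadratic, r_linear)
```

The quadratic case gives $r = 0$ even though the variables are clearly not independent, while the exact line gives $r = 1$.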

Question 5. Spearman's Rank Correlation Coefficient is used in certain situations. Which of the following is NOT a suitable situation for using Spearman's coefficient?

(A) The data is in the form of ranks.

(B) The relationship between the variables is monotonic but not linear.

(C) The data is quantitative and the relationship is clearly linear, without outliers.

(D) When Pearson's coefficient is less appropriate due to violations of assumptions or non-linearity.

Answer:

Question 6. If two variables have a perfect positive linear correlation, which of the following is NOT true?

(A) The correlation coefficient is +1.

(B) The points in a scatter diagram lie exactly on a straight line sloping upwards.

(C) As one variable increases, the other decreases proportionally.

(D) There is a perfect linear association.

Answer:

Question 7. Which of the following is NOT a correct statement about the relationship between correlation and causation?

(A) Correlation indicates that two variables are associated.

(B) Causation implies that a change in one variable directly leads to a change in the other.

(C) Finding a strong correlation is sufficient evidence to conclude causation.

(D) It is important not to confuse correlation with causation.

Answer:

Question 8. If the correlation coefficient between hours of sleep and alertness score is $-0.7$, which of the following is NOT a valid interpretation?

(A) There is a strong negative linear relationship.

(B) As hours of sleep increase, alertness tends to decrease.

(C) As hours of sleep decrease, alertness tends to increase.

(D) Sleeping more causes reduced alertness.

Answer:

Question 9. Which of the following is NOT a measure used to quantify the strength and direction of association between two variables?

(A) Karl Pearson's coefficient.

(B) Spearman's rank correlation coefficient.

(C) Coefficient of Variation.

(D) Scatter diagram (it visualizes, doesn't quantify).

Answer:

Question 10. If the correlation coefficient between two variables is $-1$, which of the following is NOT true?

(A) There is a perfect negative linear relationship.

(B) The points in a scatter diagram lie exactly on a straight line.

(C) As one variable increases, the other increases proportionally.

(D) The relationship is perfectly linear and inverse.

Answer:

Introduction to Probability: Basic Terms and Concepts

Question 1. Which of the following is NOT a characteristic of a random experiment?

(A) All possible outcomes are known in advance.

(B) The exact outcome cannot be predicted with certainty before the experiment.

(C) The experiment can be repeated under similar conditions.

(D) The outcomes are always equally likely.

Answer:

Question 2. The sample space of a random experiment is the set of all possible outcomes. Which of the following is NOT a correct statement about the sample space?

(A) It is denoted by $\Omega$ or S.

(B) Every outcome of the experiment is an element of the sample space.

(C) It is always a finite set.

(D) It represents all potential results of the experiment.

Answer:

Question 3. An event is a subset of the sample space. Which of the following is NOT a type of event?

(A) Simple event.

(B) Compound event.

(C) Random event.

(D) Impossible event.

Answer:

Question 4. Two events A and B are mutually exclusive if they cannot occur at the same time. Which of the following is NOT a characteristic of mutually exclusive events?

(A) Their intersection is the empty set ($\emptyset$).

(B) $P(A \cap B) = 0$.

(C) They are also necessarily independent events.

(D) The occurrence of A means B cannot occur in the same trial.

Answer:

Question 5. According to the classical (theoretical) definition of probability, which of the following is NOT required?

(A) The sample space is finite.

(B) All outcomes in the sample space are equally likely.

(C) The probability of an event is the ratio of favorable outcomes to total outcomes.

(D) The probability is based on the observed frequency in a large number of trials.

Answer:

Question 6. Which of the following values CANNOT represent the probability of an event?

(A) 0.

(B) 1/2.

(C) $-0.1$.

(D) 1.

Answer:

Question 7. Experimental (Empirical) Probability is based on observations. Which of the following is NOT a correct statement about experimental probability?

(A) It is calculated as the number of times an event occurred divided by the total number of trials.

(B) It is always equal to the theoretical probability.

(C) It tends to get closer to the theoretical probability as the number of trials increases.

(D) It is based on actual results of an experiment.

Answer:

Question 8. If the probability of an event is 0, which of the following is NOT true?

(A) The event is an impossible event.

(B) The event cannot occur.

(C) The event is a sure event.

(D) The number of favorable outcomes is 0 (for finite sample space with equally likely outcomes).

Answer:

Question 9. Which of the following is NOT a correct description of a compound event?

(A) It is a subset of the sample space.

(B) It consists of only a single outcome.

(C) It consists of more than one outcome.

(D) Getting an even number when rolling a die is a compound event (outcomes {2, 4, 6}).

Answer:

Question 10. If event A is 'getting a number less than 4' and event B is 'getting a number greater than 3' when rolling a die, which of the following is NOT true about A and B?

(A) $A = \{1, 2, 3\}$.

(B) $B = \{4, 5, 6\}$.

(C) A and B are mutually exclusive.

(D) $A \cup B = \{1, 2, 3, 4, 5, 6\}$ (Sample space).

Answer:

Axiomatic Approach and Laws of Probability

Question 1. The axiomatic approach to probability is based on a set of fundamental rules. Which of the following is NOT one of the axioms?

(A) For any event A, $0 \le P(A) \le 1$.

(B) The probability of the sample space is 1, $P(\Omega) = 1$.

(C) For a sequence of pairwise mutually exclusive events $A_1, A_2, ...$, $P(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty P(A_i)$.

(D) For any two events A and B, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Answer:

Question 2. The Law of Complementary Events states $P(A') = 1 - P(A)$. Which of the following is NOT implied by this law?

(A) $P(A) + P(A') = 1$.

(B) A and A' are mutually exclusive events.

(C) A and A' are independent events.

(D) The union of A and A' covers the entire sample space.

Answer:

Question 3. The Addition Law for any two events A and B is $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. Which of the following is NOT true?

(A) If A and B are mutually exclusive, $P(A \cap B) = 0$.

(B) If A and B are mutually exclusive, $P(A \cup B) = P(A) + P(B)$.

(C) The term $P(A \cap B)$ is subtracted to avoid double-counting the intersection.

(D) This law is only applicable if A and B are independent.

Answer:

Question 4. Suppose $P(A) = 0.6$ and $P(B) = 0.5$, and A and B are claimed to be mutually exclusive. Which of the following is NOT true?

(A) Mutual exclusivity would require $P(A \cap B) = 0$.

(B) The addition rule for mutually exclusive events would give $P(A \cup B) = 0.6 + 0.5 = 1.1$.

(C) The value 1.1 is a valid probability, so these events can be mutually exclusive in a shared sample space.

(D) Mutually exclusive events A and B cannot occur simultaneously.

Answer:

Question 5. Suppose $P(A) = 0.7$, $P(B) = 0.6$, and $P(A \cup B) = 0.8$. Which of the following is NOT true?

(A) $P(A \cap B) = P(A) + P(B) - P(A \cup B) = 0.7 + 0.6 - 0.8 = 0.5$.

(B) $P(A \cap B) = 0.5$.

(C) $P(A \cup B) = 0.7 + 0.6 = 1.3$ if A and B were mutually exclusive.

(D) $P(A \cap B) = 0.7 \times 0.6 = 0.42$ if A and B were independent.

Answer:
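The numbers in Question 5 can be verified with exact fractions. This is a small sketch using `fractions.Fraction` from the standard library; the variable names are hypothetical.

```python
from fractions import Fraction

p_a = Fraction(7, 10)
p_b = Fraction(6, 10)
p_union = Fraction(8, 10)

# Addition law rearranged: P(A and B) = P(A) + P(B) - P(A or B)
p_intersection = p_a + p_b - p_union

# Under independence the intersection would equal the product instead
p_product = p_a * p_b
independent = p_intersection == p_product

print(p_intersection, p_product, independent)
```

The intersection works out to $1/2$, while the product $P(A)P(B)$ is $21/50 = 0.42$, so with these values A and B are not independent. A sum of 1.3 cannot be a probability, so A and B cannot be mutually exclusive either.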

Question 6. In the axiomatic approach, probabilities are assigned to events. Which of the following is NOT a consequence of the axioms?

(A) $P(\emptyset) = 0$ (Probability of the impossible event).

(B) If A $\subseteq$ B, then $P(A) \le P(B)$.

(C) $P(A \cup B) = P(A) + P(B)$ for *any* two events A and B.

(D) $P(A') = 1 - P(A)$.

Answer:

Question 7. Which of the following is NOT a correct statement about the union ($A \cup B$) and intersection ($A \cap B$) of two events?

(A) $A \cup B$ means A occurs or B occurs (or both).

(B) $A \cap B$ means A and B both occur.

(C) For mutually exclusive events, $A \cup B = \emptyset$.

(D) If A and B are exhaustive events, $A \cup B = \Omega$.

Answer:

Question 8. Suppose $P(A) = 0.2$, $P(B) = 0.3$, and $P(A \cap B) = 0.06$. Which of the following is NOT true?

(A) $P(A \cup B) = 0.2 + 0.3 - 0.06 = 0.44$.

(B) A and B are independent ($P(A \cap B) = P(A)P(B) = 0.2 \times 0.3 = 0.06$).

(C) A and B are mutually exclusive.

(D) $P(A \cup B) = 0.44$.

Answer:

Question 9. Consider three events A, B, and C in a sample space. Which of the following is NOT a valid statement about probabilities?

(A) $P(A \cup B \cup C) = P(A) + P(B) + P(C)$ if they are pairwise mutually exclusive.

(B) $P(A \cap B \cap C) \le P(A)$.

(C) $P(A \cup B \cup C) \ge P(A)$.

(D) $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$ (Inclusion-Exclusion Principle).

Answer:

Question 10. If events $E_1, E_2, ..., E_n$ form a partition of the sample space, which of the following is NOT true?

(A) $\cup_{i=1}^n E_i = \Omega$.

(B) $E_i \cap E_j = \emptyset$ for $i \neq j$.

(C) $\sum_{i=1}^n P(E_i) = 1$.

(D) $P(E_i) > 0$ for all $i$ (this is often assumed, for example in the Law of Total Probability, but it is not part of the definition of a partition: an event with probability 0 may belong to one).

Answer:

Conditional Probability

Question 1. Conditional probability $P(A|B)$ is the probability of A given B. Which of the following is NOT a correct formula or property?

(A) $P(A|B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) \neq 0$.

(B) $P(A \cap B) = P(A|B)P(B)$.

(C) $0 \le P(A|B) \le 1$.

(D) $P(A|B) = P(B|A)$.

Answer:

Question 2. If A and B are independent events, which of the following is NOT true?

(A) $P(A|B) = P(A)$, provided $P(B) \neq 0$.

(B) $P(B|A) = P(B)$, provided $P(A) \neq 0$.

(C) $P(A \cap B) = P(A)P(B)$.

(D) $P(A|B) = 0$ if $P(A) \neq 0$ and $P(B) \neq 0$.

Answer:

Question 3. If A and B are mutually exclusive events and $P(B) \neq 0$, which of the following is NOT true?

(A) $P(A \cap B) = 0$.

(B) $P(A|B) = 0$.

(C) $P(B|A)$ is also necessarily 0 (assuming $P(A) \neq 0$).

(D) $P(A|B) = P(A)$.

Answer:

Question 4. Which of the following is NOT a correct property of conditional probability $P(A|B)$ for valid event B ($P(B) \neq 0$)?

(A) $P(\Omega|B) = 1$.

(B) $P(A'|B) = 1 - P(A|B)$.

(C) For mutually exclusive events A and C, $P(A \cup C|B) = P(A|B) + P(C|B)$.

(D) $P(A|B)$ can be greater than 1.

Answer:

Question 5. Suppose $P(A) = 0.5$, $P(B) = 0.6$, and $P(A \cap B) = 0.3$. Which of the following is NOT true?

(A) A and B are independent ($P(A)P(B) = 0.5 \times 0.6 = 0.3 = P(A \cap B)$).

(B) $P(A|B) = P(A) = 0.5$.

(C) $P(B|A) = P(B) = 0.6$.

(D) A and B are mutually exclusive.

Answer:

Question 6. Consider drawing two cards from a deck without replacement. Let A be drawing a King on the first draw and B be drawing a Queen on the second draw. Which of the following is NOT true?

(A) A and B are dependent events.

(B) $P(A) = 4/52$.

(C) $P(B|A) = 4/51$ (Probability of drawing a Queen given a King was drawn and not replaced).

(D) $P(A \cap B) = P(A) \times P(B)$.

Answer:
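The card-drawing setup in Question 6 can be worked through with exact fractions. The sketch below assumes a standard 52-card deck with four Kings and four Queens; the variable names are hypothetical.

```python
from fractions import Fraction

p_king_first = Fraction(4, 52)               # P(A)
p_queen_second_given_king = Fraction(4, 51)  # P(B|A): all four Queens remain among 51 cards

# Multiplication law: P(A and B) = P(A) * P(B|A)
p_both = p_king_first * p_queen_second_given_king

# Unconditional P(B), splitting on whether the first card was a Queen
p_queen_second = Fraction(4, 52) * Fraction(3, 51) + Fraction(48, 52) * Fraction(4, 51)

print(p_both, p_queen_second, p_king_first * p_queen_second)
```

Here $P(A \cap B) = 4/663$, whereas $P(A) \times P(B) = 1/169$. The two differ, which is what dependence means in this setting.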

Question 7. If $P(A|B) = P(A)$, and $P(A) \neq 0, P(B) \neq 0$, which of the following is NOT true?

(A) A and B are independent.

(B) $P(B|A) = P(B)$.

(C) $P(A \cap B) = P(A)P(B)$.

(D) A and B are mutually exclusive.

Answer:

Question 8. Which of the following is NOT a situation where conditional probability is directly relevant?

(A) The probability of it raining tomorrow given that it is cloudy today.

(B) The probability of getting heads on the second coin toss given the first toss was heads (for a fair coin).

(C) The probability of a student passing an exam given they studied for it.

(D) The probability of rolling a 6 on a die.

Answer:

Question 9. If $P(A \cap B) = 0$, which of the following is NOT necessarily true?

(A) A and B are mutually exclusive.

(B) $P(A \cup B) = P(A) + P(B)$.

(C) $P(A|B) = 0$ (provided $P(B) \neq 0$).

(D) A and B are independent (unless $P(A)=0$ or $P(B)=0$).

Answer:

Question 10. Which of the following is NOT a property of conditional probability derived from the axioms of probability?

(A) $P(A|B) \ge 0$ for any events A, B with $P(B) \neq 0$.

(B) $P(B|B) = 1$ for any event B with $P(B) \neq 0$.

(C) $P(\emptyset|B) = 0$ for any event B with $P(B) \neq 0$.

(D) $P(A \cap B|B) = P(A|B)P(B|B)$.

Answer:

Probability Theorems: Multiplication Law and Total Probability

Question 1. The Multiplication Law of Probability states $P(A \cap B) = P(A)P(B|A)$ or $P(A \cap B) = P(B)P(A|B)$. Which of the following is NOT a correct application or consequence of this law?

(A) Finding the probability that both A and B occur.

(B) If A and B are independent, $P(A \cap B) = P(A)P(B)$.

(C) Finding the probability that A or B occurs.

(D) Finding the probability of a sequence of events occurring (e.g., drawing without replacement).

Answer:

Question 2. Events A and B are independent if the occurrence of one does not affect the probability of the other. Which of the following is NOT a correct condition for independence?

(A) $P(A|B) = P(A)$, provided $P(B) \neq 0$.

(B) $P(B|A) = P(B)$, provided $P(A) \neq 0$.

(C) $P(A \cap B) = P(A) + P(B)$.

(D) $P(A \cap B) = P(A)P(B)$.

Answer:

Question 3. Let A and B be two events such that $P(A) \neq 0$ and $P(B) \neq 0$. Which of the following is NOT true?

(A) If A and B are independent, they are not mutually exclusive (unless $P(A)=0$ or $P(B)=0$).

(B) If A and B are mutually exclusive, they are not independent (unless $P(A)=0$ or $P(B)=0$).

(C) Independent events are always mutually exclusive.

(D) Mutually exclusive events have $P(A \cap B)=0$, while independent events have $P(A \cap B)=P(A)P(B)$.

Answer:

Question 4. A collection of events $E_1, E_2, ..., E_n$ forms a partition of the sample space $\Omega$. Which of the following is NOT a required property of these events?

(A) They are mutually exclusive (pairwise disjoint).

(B) They are exhaustive (their union covers $\Omega$).

(C) $P(E_i) > 0$ for all i (not strictly required for a partition, though often assumed for theorems).

(D) $E_i \cap E_j = \emptyset$ for $i \neq j$.

Answer:

Question 5. The Law of Total Probability states that if $E_1, E_2, ..., E_n$ is a partition of $\Omega$ with $P(E_i)>0$, then $P(A) = \sum_{i=1}^n P(A|E_i)P(E_i)$ for any event A. Which of the following is NOT a correct interpretation or use?

(A) It finds the marginal probability of A.

(B) It sums the probabilities of A occurring with each of the partitioning events.

(C) It is applicable even if the partitioning events are not mutually exclusive.

(D) It requires knowing the conditional probabilities of A given the partitioning events and the prior probabilities of the partitioning events.

Answer:

Question 6. If A and B are independent events, which of the following is NOT necessarily true?

(A) A' and B are independent.

(B) A and B' are independent.

(C) A' and B' are independent.

(D) $P(A \cup B) = P(A) + P(B)$.

Answer:

Question 7. Consider drawing a marble from a bag and then drawing a second marble. Which of the following scenarios DOES NOT represent independent events?

(A) Drawing with replacement.

(B) Drawing without replacement.

(C) Drawing from one bag and then drawing from a completely different bag.

(D) Drawing a marble, returning it to the bag, and then drawing again.

Answer:

Question 8. Suppose $P(A)=0.4$, $P(B)=0.5$, and A and B are independent. Which of the following is NOT a correct calculation?

(A) $P(A \cap B) = 0.4 \times 0.5 = 0.2$.

(B) $P(A \cup B) = 0.4 + 0.5 - 0.2 = 0.7$.

(C) $P(A|B) = 0.4$.

(D) $P(A \cup B) = 0.4 + 0.5 = 0.9$.

Answer:
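
As a worked check on Question 8 above, the three laws it relies on can be evaluated in a few lines of Python. This is only a sketch: it assumes the values stated in the question and that A and B are independent, and the variable names are illustrative.

```python
# Values stated in Question 8; independence of A and B is assumed.
p_a = 0.4
p_b = 0.5

p_a_and_b = p_a * p_b               # multiplication law for independent events
p_a_or_b = p_a + p_b - p_a_and_b    # addition law (inclusion-exclusion)
p_a_given_b = p_a_and_b / p_b       # definition of conditional probability

print(round(p_a_and_b, 10), round(p_a_or_b, 10), round(p_a_given_b, 10))
```

Rounding is used only to hide floating-point noise in the printed values.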

Question 9. Which of the following is NOT a correct statement about conditional probability and independence?

(A) If A and B are independent, $P(A|B)=P(A)$.

(B) If $P(A|B)=P(A)$, A and B are independent (assuming probabilities > 0).

(C) If A and B are mutually exclusive and $P(A)>0, P(B)>0$, they are independent.

(D) Conditional probability is crucial for analyzing dependent events using the multiplication law.

Answer:

Question 10. Suppose $E_1, E_2$ form a partition of $\Omega$ with $P(E_1) = 0.6$ and $P(E_2) = 0.4$, and that $P(A|E_1) = 0.1$ and $P(A|E_2) = 0.2$. Which of the following is NOT true?

(A) $P(E_1) + P(E_2) = 1$.

(B) $E_1$ and $E_2$ are mutually exclusive.

(C) $P(A) = P(A|E_1)P(E_1) + P(A|E_2)P(E_2) = 0.1 \times 0.6 + 0.2 \times 0.4 = 0.06 + 0.08 = 0.14$.

(D) $P(A)$ is calculated by simply summing the conditional probabilities: $0.1 + 0.2 = 0.3$.

Answer:
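
The Law of Total Probability used in Question 10 can be sketched as a small Python helper. The function name `total_probability` is hypothetical, and the sketch assumes the events passed in really do form a partition (it only checks that the priors sum to 1).

```python
def total_probability(priors, likelihoods):
    """Return P(A) = sum_i P(A|E_i) * P(E_i) for a partition E_1..E_n."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors of a partition must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))

# Values from Question 10: P(E1)=0.6, P(E2)=0.4, P(A|E1)=0.1, P(A|E2)=0.2
p_a = total_probability([0.6, 0.4], [0.1, 0.2])
print(round(p_a, 10))
```

Each conditional probability is weighted by the prior of its partitioning event before the terms are summed.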

Bayes’ Theorem

Question 1. Bayes' Theorem is given by the formula $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$. Which of the following is NOT a correct description of a term in this formula?

(A) $P(A)$: Prior probability of A.

(B) $P(B|A)$: Likelihood of B given A.

(C) $P(A|B)$: Posterior probability of A given B.

(D) $P(B)$: Conditional probability of B given A.

Answer:

Question 2. Which of the following is NOT a primary application area for Bayes' Theorem?

(A) Updating probabilities based on new evidence.

(B) Calculating the probability of the union of two events.

(C) Medical diagnosis (probability of disease given symptoms/test results).

(D) Spam filtering (probability of email being spam given its content).

Answer:

Question 3. If $E_1, E_2, ..., E_n$ form a partition of $\Omega$, and A is an event, the general form of Bayes' Theorem for $P(E_i|A)$ is $\frac{P(A|E_i)P(E_i)}{\sum_{j=1}^n P(A|E_j)P(E_j)}$. Which of the following is NOT true?

(A) The numerator is $P(A \cap E_i)$.

(B) The denominator is $P(A)$ calculated using the Law of Total Probability.

(C) The denominator represents the sum of prior probabilities.

(D) The formula calculates the posterior probability of $E_i$ given A.

Answer:

Question 4. Consider the statement: "If a test for a rare disease is positive, it is highly likely the person has the disease." Which of the following factors, when using Bayes' Theorem, might NOT support this statement?

(A) A very high true positive rate for the test.

(B) A very low prior probability of the disease in the population.

(C) A very low false positive rate for the test.

(D) The disease being very common in the population (high prior probability).

Answer:
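
To see how the prior probability interacts with the test's error rates in Question 4, here is a minimal Python sketch of Bayes' Theorem for a two-event partition (disease, no disease). The prevalence, sensitivity, and false positive rate below are hypothetical illustration values, not data from any real test.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' Theorem."""
    # Denominator: P(positive) from the Law of Total Probability.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive

# Hypothetical test: 99% true positive rate, 5% false positive rate.
rare = posterior_given_positive(prior=0.001, sensitivity=0.99, false_positive_rate=0.05)
common = posterior_given_positive(prior=0.30, sensitivity=0.99, false_positive_rate=0.05)

print(f"rare disease:   {rare:.4f}")
print(f"common disease: {common:.4f}")
```

With these assumed rates the posterior is roughly 0.02 when the prior is 0.001 and roughly 0.89 when the prior is 0.30, even though the test itself is unchanged.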

Question 5. Bayes' Theorem allows updating beliefs. Which of the following is NOT a correct interpretation of the update process?

(A) Prior probability is updated by the evidence to get the posterior probability.

(B) The likelihood ($P(B|A)$) quantifies how well the evidence B is explained by the hypothesis A.

(C) The denominator $P(B)$ acts as a scaling factor.

(D) The posterior probability $P(A|B)$ is always equal to the prior probability $P(A)$.

Answer:

Question 6. Which of the following is NOT true about the relationship between Bayes' Theorem and Conditional Probability?

(A) Bayes' Theorem is derived from the definition of conditional probability.

(B) Bayes' Theorem provides a way to reverse conditional probabilities (e.g., finding $P(A|B)$ from $P(B|A)$).

(C) Conditional probability is only a special case of Bayes' Theorem.

(D) Both concepts are fundamental for understanding how the occurrence of one event affects the probability of another.

Answer:

Question 7. Suppose $P(A|B) = 0.8$ and $P(A) = 0.5$. Which of the following is NOT necessarily true?

(A) The occurrence of B makes A more likely.

(B) A and B are dependent events.

(C) $P(A \cap B) = 0.8 \times P(B)$.

(D) $P(B|A) = 0.8$.

Answer:

Question 8. Bayes' Theorem allows calculation of $P(E_i|A)$ given a partition $E_1, ..., E_n$. Which of the following is NOT required to calculate this?

(A) Prior probabilities $P(E_i)$ for all events in the partition.

(B) Likelihoods $P(A|E_i)$ for all events in the partition.

(C) The marginal probability $P(A)$.

(D) The posterior probabilities $P(E_j|A)$ for all $j \neq i$ (these are outputs of the theorem, not inputs to it).

Answer:

Question 9. Which of the following is NOT a scenario where a Bayesian approach might be particularly valuable?

(A) When there is prior knowledge or belief that needs to be updated with data.

(B) When dealing with decision-making under uncertainty.

(C) When comparing the means of two independent groups (often done with frequentist t-tests).

(D) When trying to determine the most likely cause of an observed event.

Answer:

Question 10. Suppose $P(A) = 0.2$, $P(B) = 0.4$, and $P(A \cup B) = 0.5$. Which of the following probabilities is NOT correctly calculated?

(A) $P(A \cap B) = P(A) + P(B) - P(A \cup B) = 0.2 + 0.4 - 0.5 = 0.1$.

(B) $P(A|B) = P(A \cap B) / P(B) = 0.1 / 0.4 = 0.25$.

(C) $P(B|A) = P(A \cap B) / P(A) = 0.1 / 0.2 = 0.5$.

(D) $P(A|B) = P(B)P(B|A) / P(A) = 0.4 \times 0.5 / 0.2 = 1$.

Answer:
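
The quantities in Question 10 can be recomputed from the three given probabilities. This Python sketch assumes only the values stated in the question. It also checks that Bayes' Theorem, $P(A|B) = P(B|A)P(A)/P(B)$, agrees with the direct definition of conditional probability.

```python
# Given values from Question 10
p_a, p_b, p_a_or_b = 0.2, 0.4, 0.5

p_a_and_b = p_a + p_b - p_a_or_b          # addition law rearranged
p_a_given_b = p_a_and_b / p_b             # definition of conditional probability
p_b_given_a = p_a_and_b / p_a
p_a_given_b_bayes = p_b_given_a * p_a / p_b   # Bayes' Theorem

print(round(p_a_and_b, 10), round(p_a_given_b, 10),
      round(p_b_given_a, 10), round(p_a_given_b_bayes, 10))
```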

Random Variables and Probability Distributions

Question 1. A random variable assigns a numerical value to each outcome of a random experiment. Which of the following is NOT a type of random variable?

(A) Discrete random variable.

(B) Continuous random variable.

(C) Deterministic variable.

(D) Quantifiable variable.

Answer:

Question 2. A discrete random variable can take specific, distinct values. Which of the following is NOT a characteristic of a discrete random variable?

(A) The values can be listed (finite or countably infinite).

(B) It can take any value within a given range or interval.

(C) Examples include the number of defective items or the number of calls received.

(D) Probabilities are described by a Probability Mass Function (PMF).

Answer:

Question 3. A continuous random variable can take any value within an interval. Which of the following is NOT true about a continuous random variable?

(A) The probability of the variable taking any single specific value is 0.

(B) Probabilities are represented by the area under the Probability Density Function (PDF).

(C) Examples include height, weight, and temperature.

(D) The sum of probabilities for all possible values is calculated by summing $P(X=x_i)$.

Answer:

Question 4. A Probability Distribution describes the likelihood of the possible values of a random variable. Which of the following is NOT a necessary property of a probability distribution for a discrete random variable X?

(A) $0 \le P(X=x) \le 1$ for all possible values of x.

(B) The sum of all probabilities $\sum P(X=x) = 1$.

(C) The expected value $E(X)$ must be an integer.

(D) The list of values of x covers all possible outcomes of the random variable.

Answer:

Question 5. The Cumulative Distribution Function (CDF), $F(x)$, for a random variable X is defined as $P(X \le x)$. Which of the following is NOT true about the CDF?

(A) It is a non-decreasing function.

(B) Its value always ranges between 0 and 1.

(C) For a discrete random variable, $F(x)$ is a step function.

(D) For a continuous random variable, the probability $P(a < X \le b)$ is equal to $F(a) - F(b)$.

Answer:

Question 6. Which of the following scenarios would NOT typically be modeled using a discrete random variable?

(A) The number of heads in 10 coin flips.

(B) The number of cars that pass a point in an hour.

(C) The time it takes for a reaction to complete.

(D) The number of students present in a class on a given day.

Answer:

Question 7. Which of the following scenarios would NOT typically be modeled using a continuous random variable?

(A) The height of plants after a month.

(B) The exact amount of milk in a jug.

(C) The number of defects in a manufactured item.

(D) The temperature of the room at a specific time.

Answer:

Question 8. For a continuous random variable X, which of the following is NOT true about its probability distribution?

(A) The total area under the PDF curve is 1.

(B) $P(X=c) = 0$ for any constant c.

(C) The PDF $f(x)$ represents the probability $P(X=x)$.

(D) The probability of X falling in an interval $[a, b]$ is the area under the PDF curve from a to b.

Answer:

Question 9. Suppose the probability distribution of X is:

x	0	1	2
P(X=x)	0.3	0.4	0.5

Which of the following is NOT true?

(A) X is a discrete random variable.

(B) The sum of probabilities $0.3+0.4+0.5 = 1.2$, which is greater than 1.

(C) This is a valid probability distribution.

(D) The possible values of X are 0, 1, and 2.

Answer:
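
The two conditions a discrete probability distribution must satisfy (each probability in $[0, 1]$, and all probabilities summing to 1) can be checked mechanically. The helper name `is_valid_pmf` and the tolerance are assumptions made for this sketch.

```python
def is_valid_pmf(probs, tol=1e-9):
    """True if every probability lies in [0, 1] and the total is 1 (within tol)."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = abs(sum(probs) - 1.0) <= tol
    return in_range and sums_to_one

table_q9 = [0.3, 0.4, 0.5]       # probabilities from the table in Question 9
print(sum(table_q9), is_valid_pmf(table_q9))
```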

Question 10. In Applied Mathematics, the study of probability distributions is important for modeling real-world phenomena. Which of the following is NOT a primary goal in this context?

(A) Understanding the likelihood of different outcomes.

(B) Calculating measures like expected value and variance.

(C) Perfectly predicting the outcome of a single random event.

(D) Making inferences or decisions based on probabilistic models.

Answer:

Measures of Probability Distributions: Expectation and Variance

Question 1. Which of the following is NOT a correct statement about the Expected Value ($E(X)$) of a random variable X?

(A) It represents the theoretical mean of the distribution.

(B) For a discrete random variable, $E(X) = \sum x_i P(X=x_i)$.

(C) It is the value that the random variable is most likely to take in a single trial.

(D) It represents the long-run average value of X.

Answer:

Question 2. The Variance ($Var(X)$) measures the spread of a random variable's distribution. Which of the following is NOT a correct statement about Variance?

(A) It is calculated as the average of the squared deviations from the mean.

(B) $Var(X) = E(X^2) - [E(X)]^2$.

(C) Its units are the same as the units of the random variable X.

(D) It is always non-negative.

Answer:

Question 3. The Standard Deviation ($\sigma_X$) is the positive square root of the Variance. Which of the following is NOT true about the Standard Deviation?

(A) It is a measure of dispersion.

(B) Its units are the same as the random variable's units.

(C) It is always greater than or equal to the Variance.

(D) It provides a sense of the typical distance of outcomes from the mean.

Answer:

Question 4. For a random variable X and constants a and b, which of the following properties of Expectation and Variance is NOT correct?

(A) $E(aX + b) = a E(X) + b$.

(B) $Var(aX + b) = a^2 Var(X)$.

(C) $E(c) = c$ for a constant c.

(D) $Var(c) = c$ for a constant c.

Answer:

Question 5. Suppose the probability distribution of X is:

x	1	2	3
P(X=x)	0.2	0.3	0.5

Which of the following is NOT correctly calculated?

(A) $E(X) = 1 \times 0.2 + 2 \times 0.3 + 3 \times 0.5 = 0.2 + 0.6 + 1.5 = 2.3$.

(B) $E(X^2) = 1^2 \times 0.2 + 2^2 \times 0.3 + 3^2 \times 0.5 = 0.2 + 1.2 + 4.5 = 5.9$.

(C) $Var(X) = E(X^2) - [E(X)]^2 = 5.9 - (2.3)^2 = 5.9 - 5.29 = 0.61$.

(D) Standard Deviation = 0.61.

Answer:
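
The expectation, variance, and standard deviation for the table in Question 5 can be computed directly from their definitions. This is a minimal sketch using only the values in the table; the variable names are illustrative.

```python
import math

values = [1, 2, 3]
probs = [0.2, 0.3, 0.5]

mean = sum(x * p for x, p in zip(values, probs))              # E(X)
second_moment = sum(x**2 * p for x, p in zip(values, probs))  # E(X^2)
variance = second_moment - mean**2                            # Var(X) = E(X^2) - [E(X)]^2
std_dev = math.sqrt(variance)                                 # positive square root of Var(X)

print(round(mean, 4), round(second_moment, 4), round(variance, 4), round(std_dev, 4))
```

Note that the standard deviation is the square root of the variance, so the two coincide only when the variance is 0 or 1.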

Question 6. Suppose $E(X) = 5$ and $Var(X) = 3$. Which of the following is NOT a correct calculation?

(A) $E(2X) = 2 \times 5 = 10$.

(B) $Var(2X) = 2 \times 3 = 6$.

(C) $E(X+4) = 5 + 4 = 9$.

(D) $Var(X+4) = 3$.

Answer:
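
The rules $E(aX+b) = aE(X)+b$ and $Var(aX+b) = a^2 Var(X)$ can be checked numerically. Question 6 does not specify a distribution, so this sketch assumes a hypothetical two-point distribution constructed to have $E(X) = 5$ and $Var(X) = 3$; the helper `moments` is likewise illustrative.

```python
import math

def moments(values, probs):
    """Return (mean, variance) of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, var

# Hypothetical distribution: X = 5 - sqrt(3) or 5 + sqrt(3), each with probability 0.5
xs = [5 - math.sqrt(3), 5 + math.sqrt(3)]
ps = [0.5, 0.5]

mean_x, var_x = moments(xs, ps)
mean_2x, var_2x = moments([2 * x for x in xs], ps)     # scaling by a = 2
mean_x4, var_x4 = moments([x + 4 for x in xs], ps)     # shifting by b = 4

print(round(mean_x, 6), round(var_x, 6))
print(round(mean_2x, 6), round(var_2x, 6))
print(round(mean_x4, 6), round(var_x4, 6))
```

Scaling multiplies the variance by the square of the constant, while shifting leaves it unchanged.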

Question 7. Which of the following is NOT true about the expected value of a discrete random variable?

(A) It is a measure of central tendency.

(B) It is always one of the possible values that the random variable can take.

(C) It is the weighted average of the possible values, with probabilities as weights.

(D) It represents the theoretical mean of the distribution.

Answer:

Question 8. Suppose $Var(X) = 0$. Which of the following is NOT necessarily true?

(A) The random variable X is a constant.

(B) The Standard Deviation is 0.

(C) $E(X) = 0$.

(D) There is no variability in the possible outcomes of X.

Answer:

Question 9. For two independent random variables X and Y, which of the following properties is NOT correct?

(A) $E(X+Y) = E(X) + E(Y)$.

(B) $Var(X+Y) = Var(X) + Var(Y)$.

(C) $E(XY) = E(X)E(Y)$.

(D) $Var(X-Y) = Var(X) - Var(Y)$.

Answer:

Question 10. Which of the following is NOT a correct interpretation of the Standard Deviation of a probability distribution?

(A) It measures the spread of the distribution around the mean.

(B) A larger standard deviation means the outcomes are more spread out.

(C) It is in the same units as the expected value.

(D) It is the average of the squared deviations from the mean.

Answer:

Binomial Distribution

Question 1. A Binomial distribution models the number of successes in a fixed number of trials. Which of the following is NOT a requirement for a Binomial experiment (Bernoulli trials)?

(A) Each trial has exactly two possible outcomes.

(B) The probability of success changes from trial to trial.

(C) The trials are independent.

(D) The number of trials is fixed.

Answer:

Question 2. The parameters of a Binomial distribution are n (number of trials) and p (probability of success). Which of the following is NOT a correct formula for measures of a Binomial distribution?

(A) $\text{Mean} = np$.

(B) $\text{Variance} = np(1-p)$.

(C) $\text{Standard Deviation} = \sqrt{np}$.

(D) $\text{Variance} = npq$, where $q = 1-p$.

Answer:

Question 3. The probability mass function (PMF) of a Binomial distribution is $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$. Which of the following is NOT a correct interpretation of a component?

(A) $\binom{n}{k}$ is the number of ways to get k successes in n trials.

(B) $p^k$ is the probability of getting k successes in a specific order.

(C) $(1-p)^{n-k}$ is the probability of getting n-k failures in a specific order.

(D) k represents the number of failures.

Answer:

Question 4. The possible values of a Binomial random variable $X \sim B(n, p)$ are the number of successes. Which of the following is NOT a possible value for X?

(A) 0.

(B) Any integer between 0 and n, inclusive.

(C) n.

(D) Any positive real number up to n.

Answer:

Question 5. Which of the following scenarios would NOT typically be modeled using a Binomial distribution?

(A) The number of heads in 20 coin tosses.

(B) The number of defective items in a random sample of 50 from a large production batch (assuming constant defect rate).

(C) The time it takes for a train to arrive at a station.

(D) The number of students who pass an exam in a class of 30, given each student's passing is independent with constant probability.

Answer:

Question 6. Which of the following is NOT true about the shape of a Binomial distribution?

(A) It is symmetric when $p = 0.5$.

(B) It is positively skewed when $p < 0.5$.

(C) It is negatively skewed when $p > 0.5$.

(D) It is always perfectly symmetric regardless of p, as long as n is large.

Answer:

Question 7. As the number of trials (n) in a Binomial distribution increases, which of the following approximations is NOT typically used?

(A) Approximation by the Normal distribution (for large n, p not near 0 or 1).

(B) Approximation by the Poisson distribution (for large n, small p).

(C) Approximation by the Uniform distribution.

(D) The distribution becomes more symmetric and bell-shaped as n grows.

Answer:

Question 8. Suppose a Binomial distribution has $\text{Mean} = 4$ and $\text{Variance} = 2.4$. Which of the following is NOT correctly derived?

(A) $np = 4$.

(B) $npq = 2.4$.

(C) $q = \text{Variance} / \text{Mean} = 2.4 / 4 = 0.6$.

(D) $p = 1 - q = 1 - 0.6 = 0.4$.

Answer:
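
The derivation in Question 8 follows from dividing $npq$ by $np$. The sketch below repeats it in Python and, as an extra step not asked for in the question, recovers $n$ from $np = 4$.

```python
mean = 4.0        # np
variance = 2.4    # npq

q = variance / mean   # npq / np = q
p = 1.0 - q
n = mean / p          # np / p = n

print(round(q, 10), round(p, 10), round(n, 10))
```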

Question 9. Consider a Binomial experiment with n=5 trials and probability of success p=0.3. Which of the following is NOT a valid probability calculation using the PMF?

(A) $P(X=0) = \binom{5}{0} (0.3)^0 (0.7)^5$.

(B) $P(X=5) = \binom{5}{5} (0.3)^5 (0.7)^0$.

(C) $P(X=k) < 0$ for some value of k.

(D) The sum of probabilities for k=0, 1, 2, 3, 4, 5 is 1.

Answer:
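
The Binomial PMF $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ with $n=5$ and $p=0.3$ can be tabulated with Python's standard library. This is a sketch; the function name `binomial_pmf` is an assumption for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(k, round(prob, 5))
print("total:", round(sum(pmf), 10))
```

Every entry is a product of non-negative factors, so no value of k can produce a negative probability.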

Question 10. Which of the following is NOT true about Bernoulli trials?

(A) They are independent.

(B) There are more than two possible outcomes.

(C) The probability of success is constant for each trial.

(D) Each trial is a Bernoulli trial in a Binomial experiment.

Answer:

Poisson Distribution

Question 1. The Poisson distribution models the number of occurrences of events in a fixed interval. Which of the following is NOT a key characteristic of a Poisson process or distribution?

(A) Events occur randomly.

(B) Events occur independently.

(C) The average rate of occurrence is constant over the interval.

(D) The number of trials is fixed and finite.

Answer:

Question 2. The Poisson distribution is a discrete probability distribution. Which of the following is NOT a possible value for a Poisson random variable?

(A) 0.

(B) Any positive integer (1, 2, 3...).

(C) $-1$.

(D) Any non-negative integer (0, 1, 2...).

Answer:

Question 3. The parameter of the Poisson distribution is $\lambda$. Which of the following is NOT true about $\lambda$?

(A) It represents the average rate of occurrence.

(B) It must be a positive value.

(C) It is equal to the Variance of the distribution.

(D) It represents the probability of success in a single trial.

Answer:

Question 4. The probability mass function (PMF) for a Poisson distribution is $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$. Which of the following is NOT a correct statement?

(A) $k$ is the number of occurrences.

(B) $e$ is the base of the natural logarithm.

(C) $k!$ is the factorial of $k$.

(D) $\lambda$ is the standard deviation.

Answer:

Question 5. Which of the following is NOT an example of a phenomenon that could potentially be modeled using a Poisson distribution?

(A) The number of accidents at a factory per year.

(B) The number of alpha particles emitted by a radioactive source in a given time interval.

(C) The maximum height of students in a class.

(D) The number of errors per page in a typed document.

Answer:

Question 6. The Poisson distribution can approximate the Binomial distribution $B(n, p)$. Which of the following conditions is NOT typically required for this approximation to be good?

(A) $n$ is large.

(B) $p$ is small.

(C) $np$ is large (e.g., $np > 10$).

(D) $np$ is moderate (a common guideline is $np < 5$ or $np < 10$).

Answer:

Question 7. If a Poisson distribution has a variance of 5, which of the following is NOT true?

(A) The mean is 5.

(B) The standard deviation is $\sqrt{5}$.

(C) The average rate of occurrence is 5.

(D) The distribution is symmetric.

Answer:

Question 8. For small values of $\lambda$, the Poisson distribution is skewed. Which of the following is NOT a correct statement about the skewness of the Poisson distribution?

(A) It is positively skewed for small $\lambda$.

(B) As $\lambda$ increases, it becomes less skewed.

(C) It is negatively skewed for small $\lambda$.

(D) It approaches symmetry as $\lambda$ gets larger.

Answer:

Question 9. Which of the following is NOT a difference between the Binomial and Poisson distributions?

(A) Binomial has a fixed number of trials; Poisson does not (it counts events over a continuous interval).

(B) Binomial models the number of successes; Poisson models the number of occurrences.

(C) The possible values for Binomial are 0 to n; for Poisson, they are non-negative integers (infinite range).

(D) Binomial is continuous; Poisson is discrete.

Answer:

Question 10. If the number of events in a given interval follows a Poisson distribution with parameter $\lambda$, which of the following is NOT true about the number of events in an interval twice as long?

(A) It will also follow a Poisson distribution.

(B) Its parameter will be $2\lambda$.

(C) Its mean will be $2\lambda$.

(D) Its variance will be $4\lambda$.

Answer:
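
For a Poisson distribution the mean and the variance both equal the parameter. The sketch below checks this numerically for the doubled interval in Question 10, taking as given the standard result that the count over an interval twice as long is Poisson with parameter $2\lambda$. The value $\lambda = 2.5$, the truncation point `k_max`, and the function names are assumptions chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)

def poisson_moments(lam, k_max=100):
    """Approximate (mean, variance) by truncating the infinite sum at k_max."""
    ks = range(k_max + 1)
    mean = sum(k * poisson_pmf(k, lam) for k in ks)
    var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
    return mean, var

lam = 2.5                                   # hypothetical rate for the original interval
mean_2, var_2 = poisson_moments(2 * lam)    # interval twice as long
print(round(mean_2, 6), round(var_2, 6))
```

Both printed values come out at $2\lambda$; the variance does not scale with the square of the interval length.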

Normal Distribution

Question 1. Which of the following is NOT a correct statement about the Normal distribution?

(A) It is a discrete probability distribution.

(B) Its graph is symmetric and bell-shaped.

(C) The Mean, Median, and Mode are equal.

(D) The total area under the curve is 1.

Answer:

Question 2. The Normal distribution is characterized by its parameters, the mean ($\mu$) and standard deviation ($\sigma$). Which of the following is NOT true?

(A) $\mu$ determines the center of the distribution.

(B) $\sigma$ determines the spread of the distribution.

(C) The shape of the distribution changes with changes in $\mu$ but not $\sigma$.

(D) A larger $\sigma$ results in a flatter and wider curve.

Answer:

Question 3. The Standard Normal distribution (Z-distribution) is a special case of the Normal distribution. Which of the following is NOT a property of the Standard Normal distribution?

(A) Mean ($\mu$) = 0.

(B) Standard Deviation ($\sigma$) = 1.

(C) Total area under the curve = 1.

(D) It is skewed to the right.

Answer:

Question 4. A z-score standardizes a value x from a Normal distribution. Which of the following is NOT a correct statement about z-scores?

(A) The formula is $z = (x - \mu) / \sigma$.

(B) A positive z-score means the value is above the mean.

(C) A negative z-score means the value is below the mean.

(D) The unit of a z-score is the same as the unit of the original data.

Answer:

Question 5. The area under the Normal curve represents probability. Which of the following is NOT a correct statement about probabilities and the Normal distribution?

(A) The probability of the variable falling between two values a and b is the area under the curve between a and b.

(B) $P(X=c) = 0$ for any specific value c.

(C) The probability of X being less than the mean is 0.5.

(D) The area to the right of a value is always greater than the area to the left of that value.

Answer:

Question 6. The Empirical Rule (68-95-99.7 rule) applies to Normal distributions. Which of the following is NOT correctly stated by this rule?

(A) Approx 68% of data within $\pm 1$ SD of the mean.

(B) Approx 95% of data within $\pm 2$ SDs of the mean.

(C) Approx 99% of data within $\pm 3$ SDs of the mean (It's 99.7%).

(D) Almost all data is within $\pm 3$ SDs of the mean.

Answer:
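
The percentages in the Empirical Rule can be reproduced from the standard Normal CDF. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Probability that a Normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {coverage(k):.4f}")
```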

Question 7. Which of the following is NOT a reason why the Normal distribution is important in statistics?

(A) Many natural phenomena are approximately normally distributed.

(B) The Central Limit Theorem relates sample means to the Normal distribution.

(C) It is the only distribution used in statistical inference.

(D) It has well-defined mathematical properties and associated tables (Z-tables).

Answer:

Question 8. If a dataset is known to be normally distributed, which of the following is NOT true?

(A) Its skewness is 0.

(B) Its kurtosis (excess) is 0.

(C) Mean, Median, and Mode are equal.

(D) It must be a discrete variable.

Answer:

Question 9. Z-tables (Standard Normal tables) are used to find probabilities associated with z-scores. Which of the following is NOT a correct use of a standard Z-table (typically cumulative from the left)?

(A) Finding $P(Z \le z)$.

(B) Finding $P(Z > z)$ by calculating $1 - P(Z \le z)$.

(C) Finding $P(a \le X \le b)$ for a Normal variable X.

(D) Finding the exact probability of a specific value $P(Z=z)$.

Answer:
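
The table lookups described in Question 9 can be mimicked with the standard Normal CDF. In this sketch `statistics.NormalDist().cdf` plays the role of a cumulative-from-the-left Z-table. The values $\mu = 100$ and $\sigma = 15$ and the helper names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # stands in for the Z-table: cdf(z) = P(Z <= z)

def prob_greater(z):
    """P(Z > z) = 1 - P(Z <= z)."""
    return 1.0 - std_normal.cdf(z)

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), by converting a and b to z-scores."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

print(round(prob_greater(1.96), 4))
print(round(prob_between(85, 115, mu=100, sigma=15), 4))
```

For a continuous variable the probability of any single exact value is 0, so only intervals and tails are looked up this way.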

Question 10. Which of the following is NOT a characteristic of the tails of the Normal distribution?

(A) They are symmetric.

(B) They extend infinitely.

(C) They touch the x-axis at $\pm 3\sigma$.

(D) They approach the x-axis asymptotically.

Answer:

Inferential Statistics: Population, Sample, and Parameters

Question 1. Which of the following is NOT a correct distinction between a population and a sample?

(A) Population is the entire group, sample is a subset.

(B) We typically study the population when it is too large or expensive to study the sample.

(C) The sample is drawn from the population.

(D) We use sample data to make inferences about the population.

Answer:

Question 2. Which of the following is NOT a correct distinction between a parameter and a statistic?

(A) A parameter describes a population, a statistic describes a sample.

(B) A parameter is usually known, while a statistic is estimated from data.

(C) $\mu$ is a parameter, $\bar{x}$ is a statistic.

(D) $\sigma^2$ is a parameter, $s^2$ is a statistic.

Answer:

Question 3. Inferential statistics involves making inferences about a population based on a sample. Which of the following is NOT a primary goal of inferential statistics?

(A) Estimating population parameters.

(B) Testing hypotheses about population parameters.

(C) Summarizing and visualizing sample data using descriptive statistics.

(D) Predicting population characteristics from sample data.

Answer:

Question 4. Sampling is the process of selecting a sample. Which of the following is NOT a goal of using appropriate sampling techniques?

(A) To obtain a representative sample.

(B) To reduce sampling variability as much as possible (though not eliminate it).

(C) To introduce bias into the selection process.

(D) To ensure valid inferences about the population.

Answer:

Question 5. Which of the following is NOT considered a probabilistic sampling technique (where selection is based on random chance)?

(A) Simple Random Sampling.

(B) Stratified Sampling.

(C) Convenience Sampling.

(D) Cluster Sampling.

Answer:

Question 6. Sampling variability is the natural variation among sample statistics. Which of the following is NOT true about sampling variability?

(A) It is reduced by increasing the sample size.

(B) It is accounted for in inferential statistics.

(C) It means that different random samples from the same population will likely yield different statistics.

(D) It implies that sample statistics are perfectly accurate estimates of population parameters.

Answer:

Question 7. Which of the following is NOT a valid reason for choosing to study a sample instead of conducting a census (studying the entire population)?

(A) The population is too large to enumerate entirely.

(B) Data collection from the whole population is too costly.

(C) Studying the sample is always more accurate than studying the population.

(D) Data collection is destructive (e.g., testing the lifespan of items).

Answer:

Question 8. If a researcher is studying the average income of households in Kerala, which of the following is NOT correct?

(A) The population is all households in Kerala.

(B) A sample is a subset of households selected for study.

(C) The average income of the sample is a parameter.

(D) The actual average income of all households in Kerala is a parameter.

Answer:

Question 9. Which of the following symbols DOES NOT represent a population parameter?

(A) $\mu$ (population mean).

(B) $\sigma$ (population standard deviation).

(C) p (population proportion).

(D) $\bar{x}$ (sample mean).

Answer:

Question 10. Which of the following symbols DOES NOT represent a sample statistic?

(A) $\bar{x}$.

(B) s.

(C) $\hat{p}$ (sample proportion).

(D) $\sigma^2$.

Answer:

Inferential Statistics: Concepts and Hypothesis Testing

Question 1. Hypothesis testing is a formal procedure to test claims about population parameters. Which of the following is NOT a key component of a hypothesis test?

(A) Null hypothesis ($H_0$).

(B) Alternative hypothesis ($H_1$).

(C) Sample size (n).

(D) Population parameter value (that is assumed true under $H_1$).

Answer:

Question 2. The null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) are statements about population parameters. Which of the following is NOT a correct property of $H_0$ and $H_1$?

(A) They are mutually exclusive.

(B) They are collectively exhaustive (cover all possibilities for the parameter).

(C) $H_0$ always states there is a difference or relationship.

(D) $H_1$ contradicts $H_0$.

Answer:

Question 3. A Type I Error occurs when $H_0$ is rejected when it is true. Which of the following is NOT true about Type I error?

(A) Its probability is denoted by $\alpha$.

(B) Its probability is also called the level of significance.

(C) It is the probability of failing to reject $H_0$ when $H_0$ is false.

(D) It is the risk of incorrectly rejecting a true null hypothesis.

Answer:

Question 4. A Type II Error occurs when you fail to reject $H_0$ when $H_0$ is false. Which of the following is NOT true about Type II error?

(A) Its probability is denoted by $\beta$.

(B) It is the probability of incorrectly failing to detect a real effect or difference.

(C) It is controlled directly by the level of significance $\alpha$.

(D) It occurs when the test statistic does not fall into the critical region, but $H_0$ is false.

Answer:

Question 5. The p-value is crucial for making a decision in hypothesis testing. Which of the following is NOT a correct interpretation of the p-value?

(A) The probability of observing the sample data (or more extreme) if the null hypothesis is true.

(B) The probability that the null hypothesis is true.

(C) It provides evidence against the null hypothesis (a smaller p-value means stronger evidence).

(D) It is compared to the level of significance ($\alpha$) to make a decision.

Answer:
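
To make the p-value concrete, here is a minimal sketch of a two-tailed one-sample z-test. All numbers (sample mean 52, hypothesized mean 50, known population standard deviation 10, sample size 100, $\alpha = 0.05$) are hypothetical, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for H0: mu = mu0 against H1: mu != mu0, with sigma known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a test statistic at least this extreme in either tail.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

z, p = two_tailed_p_value(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
alpha = 0.05
decision = "reject H0" if p <= alpha else "fail to reject H0"
print(round(z, 4), round(p, 4), decision)
```

The p-value here is computed assuming $H_0$ is true. It describes the data under that assumption, not the probability that $H_0$ itself is true.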

Question 6. If the p-value is less than or equal to the level of significance ($\alpha$), the decision is to reject the null hypothesis. Which of the following is NOT implied by this decision?

(A) There is statistically significant evidence against $H_0$ at the $\alpha$ level.

(B) The observed data is unlikely under the assumption that $H_0$ is true.

(C) The alternative hypothesis $H_1$ is proven to be true.

(D) The result is statistically significant.

Answer:

Question 7. If the p-value is greater than the level of significance ($\alpha$), the decision is to fail to reject the null hypothesis. Which of the following is NOT implied by this decision?

(A) The sample data does not provide sufficient evidence to reject $H_0$ at the $\alpha$ level.

(B) The null hypothesis is proven to be true.

(C) The observed data is reasonably likely under the assumption that $H_0$ is true.

(D) There is no statistically significant evidence against $H_0$ at the $\alpha$ level.

Answer:
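
As an illustration of the decision rule described in Questions 6 and 7, here is a minimal Python sketch. The function name `decide` and the default `alpha = 0.05` are assumptions made for this example only.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a given p-value and significance level."""
    if not (0.0 <= p_value <= 1.0) or not (0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Reject H0 only when the p-value is at or below alpha.
    if p_value <= alpha:
        return "reject H0"
    # Otherwise H0 is retained; this does not prove that H0 is true.
    return "fail to reject H0"


print(decide(0.04))  # p-value below alpha
print(decide(0.20))  # p-value above alpha
```

Note that neither outcome proves a hypothesis: the function only reports whether the data are sufficiently unlikely under $H_0$ at the chosen $\alpha$.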

Question 8. Which of the following is NOT a correct statement about the critical region (rejection region) of a hypothesis test?

(A) It is the region of the test statistic's sampling distribution that leads to rejecting $H_0$.

(B) Its size is determined by the level of significance $\alpha$ and the type of test (one-tailed or two-tailed).

(C) If the calculated test statistic falls into the critical region, we fail to reject $H_0$.

(D) It represents the values of the test statistic that are considered unlikely if $H_0$ is true.

Answer:

Question 9. Which of the following is NOT a correct pairing of a test type with its alternative hypothesis structure?

(A) Two-tailed test: $H_1: \theta \neq \theta_0$.

(B) Right-tailed test: $H_1: \theta > \theta_0$.

(C) Left-tailed test: $H_1: \theta < \theta_0$.

(D) One-tailed test: $H_1: \theta \neq \theta_0$.

Answer:
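
To show how the form of $H_1$ changes the calculation, the sketch below computes a p-value for each tail type. It assumes, for simplicity, a test statistic that follows the Standard Normal distribution (a z statistic); the function name `z_p_value` and the `tail` labels are hypothetical choices for this example.

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str) -> float:
    """p-value of a standard normal test statistic for a given tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # H1: theta != theta_0, both tails count as extreme
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    if tail == "right":
        # H1: theta > theta_0, only large values count as extreme
        return 1.0 - std_normal.cdf(z)
    if tail == "left":
        # H1: theta < theta_0, only small values count as extreme
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'right' or 'left'")


for tail in ("two", "right", "left"):
    print(tail, round(z_p_value(1.96, tail), 4))
```

For the same statistic, the two-tailed p-value is twice the smaller of the two one-tailed p-values.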

Question 10. The power of a statistical test is the probability of correctly rejecting the null hypothesis when it is false ($1 - \beta$). Which of the following is NOT true about the power of a test?

(A) It is the probability of avoiding a Type II error.

(B) It is related to the sample size, effect size, and $\alpha$ level.

(C) A higher power means a lower chance of detecting a real effect.

(D) Researchers generally aim for a test with high power.

Answer:
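
The dependence of power on sample size, effect size, and $\alpha$ can be checked numerically. The sketch below assumes a right-tailed z-test with known population standard deviation, for which power $= 1 - \Phi\left(z_{1-\alpha} - \frac{\delta\sqrt{n}}{\sigma}\right)$, where $\delta$ is the true difference from the hypothesized mean. The function name and the numbers used are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power (1 - beta) of a right-tailed z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)  # boundary of the critical region
    shift = effect * sqrt(n) / sigma          # true effect in standard-error units
    return 1.0 - std_normal.cdf(z_crit - shift)


# Hypothetical scenario: true effect 0.5, sigma 1.0
for n in (10, 25, 50):
    print(n, round(z_test_power(0.5, 1.0, n), 3))
```

Under these assumptions, power rises as the sample size, the effect size, or $\alpha$ increases, and the Type II error probability is $\beta = 1 - \text{power}$.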

Inferential Statistics: t-Test

Question 1. The t-distribution is used in t-tests. Which of the following is NOT a characteristic of the t-distribution?

(A) It is symmetric and bell-shaped.

(B) Its shape is determined by the degrees of freedom.

(C) It has thinner tails than the Standard Normal distribution for small degrees of freedom.

(D) It approaches the Standard Normal distribution as the degrees of freedom increase.

Answer:
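
The tail behaviour of the t-distribution can be compared with the Standard Normal distribution directly from the density formula $f(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}$, where $\nu$ is the degrees of freedom. The following is a minimal sketch using only the Python standard library; the function name `t_pdf` is an assumption for this example.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    # Work on the log scale so that large df does not overflow the gamma function.
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


normal_density = NormalDist().pdf(3.0)
for df in (3, 30, 1000):
    # Density far out in the tail (x = 3) for increasing degrees of freedom
    print(df, round(t_pdf(3.0, df), 5), round(normal_density, 5))
```

For small degrees of freedom the density at $x = 3$ is noticeably larger than the normal density, and the gap closes as the degrees of freedom grow.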

Question 2. Which of the following is NOT a situation where a t-test is typically appropriate?

(A) Comparing a sample mean to a known population mean when the population standard deviation is unknown.

(B) Comparing the means of two independent groups when the population standard deviations are unknown.

(C) Comparing the means of paired or dependent samples.

(D) Comparing population proportions.

Answer:

Question 3. The degrees of freedom for a t-test depend on the sample size(s) and the specific test. Which of the following is NOT a correct statement about degrees of freedom?

(A) One-sample t-test with sample size n: $df = n-1$.

(B) Two independent samples t-test (equal variances) with sizes $n_1, n_2$: $df = n_1 + n_2 - 2$.

(C) Paired samples t-test with n pairs: $df = n-2$.

(D) Degrees of freedom influence the critical t-value and the shape of the t-distribution.

Answer:
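
For reference, the standard degrees-of-freedom formulas for the three common t-tests can be written as small helper functions. The function names below are hypothetical, chosen only for this sketch.

```python
def df_one_sample(n: int) -> int:
    """One-sample t-test: df = n - 1."""
    if n < 2:
        raise ValueError("need at least 2 observations")
    return n - 1


def df_pooled(n1: int, n2: int) -> int:
    """Two independent samples, equal variances assumed: df = n1 + n2 - 2."""
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least 2 observations per group")
    return n1 + n2 - 2


def df_paired(n_pairs: int) -> int:
    """Paired samples: the test is a one-sample test on the n differences."""
    return df_one_sample(n_pairs)


print(df_one_sample(12), df_pooled(12, 15), df_paired(12))
```

The paired test reuses the one-sample formula because it is carried out on the single column of within-pair differences.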

Question 4. A one-sample t-test is used to test a hypothesis about a population mean $\mu$. Which of the following is NOT a necessary input for performing this test?

(A) The sample mean ($\bar{x}$).

(B) The hypothesized population mean ($\mu_0$) from the null hypothesis.

(C) The population standard deviation ($\sigma$).

(D) The sample size (n).

Answer:

Question 5. A two independent samples t-test compares the means of two groups. Which of the following is NOT a key assumption for the standard (pooled variance) version of this test?

(A) The two samples are independent.

(B) The data in each group are sampled from populations that are approximately normally distributed.

(C) The population means of the two groups are equal.

(D) The population variances of the two groups are equal.

Answer:
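
The pooled-variance test statistic can be sketched as follows, using $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ and $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}$. The function name and the two small data sets are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 observations")
    # statistics.variance uses the n - 1 denominator (sample variance).
    s1_sq, s2_sq = variance(sample1), variance(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_error, df


# Hypothetical marks for two independent groups
group_a = [72, 75, 78, 81, 84]
group_b = [65, 70, 75, 80, 85]
t_stat, df = pooled_t_statistic(group_a, group_b)
print(round(t_stat, 4), df)
```

Equality of the population means is what the test examines under $H_0$; only independence, approximate normality, and equal variances are assumed in computing the statistic.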

Question 6. The t-test is considered robust to violations of the normality assumption under certain conditions. Which of the following is NOT true about this robustness?

(A) It is more robust with larger sample sizes.

(B) Moderate skewness or outliers have less impact with large samples.

(C) Severe non-normality or outliers will not affect the validity of the t-test results at all.

(D) The Central Limit Theorem helps explain this robustness for sample means.

Answer:

Question 7. Which of the following is NOT a valid hypothesis structure for a one-sample t-test about a population mean $\mu$ with hypothesized value $\mu_0$?

(A) $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$.

(B) $H_0: \mu \ge \mu_0$ vs $H_1: \mu < \mu_0$.

(C) $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$.

(D) $H_0: \bar{x} = \mu_0$ vs $H_1: \bar{x} \neq \mu_0$.

Answer:

Question 8. If a t-test results in a p-value of 0.04 and the chosen level of significance $\alpha = 0.05$, which of the following is NOT a correct conclusion?

(A) The result is statistically significant at the 5% level.

(B) We reject the null hypothesis.

(C) We fail to reject the null hypothesis because the p-value is less than $\alpha$.

(D) There is sufficient evidence to support the alternative hypothesis at the 5% level.

Answer:

Question 9. The t-statistic measures how many standard errors the sample mean is away from the hypothesized population mean. Which of the following is NOT a component needed to calculate the t-statistic for a one-sample t-test?

(A) Sample mean ($\bar{x}$).

(B) Population standard deviation ($\sigma$).

(C) Hypothesized population mean ($\mu_0$).

(D) Sample standard deviation (s).

Answer:
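
The one-sample t-statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with $df = n - 1$. A minimal Python sketch is given below; the function name and the sample values are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0: float):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    x_bar = mean(sample)
    s = stdev(sample)          # sample standard deviation (n - 1 denominator)
    std_error = s / sqrt(n)    # estimated standard error of the mean
    return (x_bar - mu0) / std_error, n - 1


# Hypothetical data tested against mu0 = 50
t_stat, df = one_sample_t([52, 48, 55, 51, 49], 50)
print(round(t_stat, 4), df)
```

The inputs are the sample mean, the sample standard deviation, the sample size, and $\mu_0$; the population standard deviation does not appear in the formula.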

Question 10. When comparing the average marks of two independent groups of students using a t-test, which of the following is NOT a typical assumption made?

(A) The marks in each group are normally distributed.

(B) The variance of marks in the two populations is equal (for the pooled variance test).

(C) The samples are drawn randomly and independently.

(D) The mean marks in the two populations are significantly different.

Answer:
